igmk / pamtra

Passive and Active Microwave TRAnsfer model
GNU General Public License v3.0

Data files not installed #9

Status: Closed (maahn closed this issue 4 years ago)

maahn commented 5 years ago

We need a way to provide and install the data files (located via the PAMTRA_DATA environment variable) automatically. Git LFS (https://git-lfs.github.com/) might be an option, but it requires an additional dependency. Maybe an FTP directory is enough to begin with?

DaveOri commented 4 years ago

I explored Git LFS a bit today and even started a local branch, but I got stuck on GitHub's quota system for LFS (https://help.github.com/en/articles/about-storage-and-bandwidth-usage). Apparently, the basic plan gives us 1 GB of file storage (pushes) per month and 1 GB of file bandwidth (pulls) per month. The first limit is not that bad: the data files on Mario's PC currently total 2.6 GB, so we could upload everything within 3 months. The second limit is harder. As I understand it, a new user could not pull the dataset all at once; it would take 3 months, which is hardly acceptable. The first upgraded plan costs $5 per month and provides 50 GB/month of traffic and storage. That would be better, but it is still not enough to keep pace with the current rate of repo cloning (https://github.com/igmk/pamtra/graphs/traffic).

What about dropping version control for the data files and storing them on a separate server, with download commands in the installation documentation?
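The quota arithmetic in this comment can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and it assumes that unused quota does not carry over from month to month and that every clone pulls the full 2.6 GB.

```python
import math


def months_needed(total_gb, quota_gb_per_month):
    """Whole months needed to move total_gb under a fixed monthly quota."""
    return math.ceil(total_gb / quota_gb_per_month)


def full_pulls_per_month(total_gb, quota_gb_per_month):
    """How many complete downloads of the dataset fit into one month's quota."""
    return math.floor(quota_gb_per_month / total_gb)


DATA_SIZE_GB = 2.6      # size of the data files quoted in the comment
BASIC_QUOTA_GB = 1.0    # basic plan: 1 GB/month for storage and for bandwidth
PAID_QUOTA_GB = 50.0    # first upgraded plan: 50 GB/month

print(months_needed(DATA_SIZE_GB, BASIC_QUOTA_GB))        # months to upload or pull once
print(full_pulls_per_month(DATA_SIZE_GB, PAID_QUOTA_GB))  # full clones per month on the paid plan
```

Under these assumptions the paid plan covers fewer than 20 full downloads per month, which is the reason it cannot keep up with the clone rate.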

mariomech commented 4 years ago

I created an FTP user (pamdata) on our FTP server, gop.meteo.uni-koeln.de, and put all the data there, so it can be downloaded now. There is no version control yet, so the issue is not completely solved. I'm also not sure how to handle the password. Should we make it public, or should new users ask for the password? The latter would let us keep track of users.
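A download from such an FTP account could be scripted with Python's standard `ftplib`. The following is a minimal sketch, not the project's installer. The host and user name come from the comment above. The remote directory, the flat layout (files only, no subdirectories), and the helper names `download_flat_dir` and `fetch_from_server` are assumptions. The password is left as a parameter because the thread does not settle how it is distributed.

```python
import os
from ftplib import FTP


def download_flat_dir(ftp, remote_dir, local_dir):
    """Download every entry of one remote directory into local_dir.

    Assumes the directory contains only regular files; subdirectories
    are not handled and would make RETR fail.
    """
    os.makedirs(local_dir, exist_ok=True)
    ftp.cwd(remote_dir)
    fetched = []
    for name in ftp.nlst():
        target = os.path.join(local_dir, name)
        with open(target, "wb") as handle:
            # retrbinary calls handle.write for each received chunk
            ftp.retrbinary("RETR " + name, handle.write)
        fetched.append(target)
    return fetched


def fetch_from_server(password, local_dir,
                      host="gop.meteo.uni-koeln.de",
                      user="pamdata",
                      remote_dir="."):
    """Log in and download one directory; remote_dir="." is a guess."""
    with FTP(host) as ftp:
        ftp.login(user=user, passwd=password)
        return download_flat_dir(ftp, remote_dir, local_dir)
```

A user would call `fetch_from_server(password, "/path/to/data")` and then point the data environment variable at that directory. Splitting out `download_flat_dir` keeps the transfer logic testable without a network connection.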

maahn commented 4 years ago

How about adding an error message when PAMTRA does not find the data files, such as: 'Please download data files from ftp://... to a directory of your choice and set the environment variable $PAMTRA_DATADIR to the corresponding directory.'

mariomech commented 4 years ago

At the beginning of core.py, I added a test for PAMTRA_DATADIR, followed by instructions on how to get the data from our FTP server.
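For readers who want to see what such a check can look like, here is a minimal sketch. It is not the code in core.py: the function name `find_data_dir` and the choice of `EnvironmentError` are assumptions, and the FTP address is left elided as in the message proposed above.

```python
import os

# Wording follows the message proposed earlier in this thread; the FTP
# address is elided there and is deliberately not filled in here.
DATA_HELP = (
    "Please download data files from ftp://... to a directory of your "
    "choice and set the environment variable $PAMTRA_DATADIR to the "
    "corresponding directory."
)


def find_data_dir(environ=os.environ):
    """Return the data directory named by PAMTRA_DATADIR or raise with help text."""
    data_dir = environ.get("PAMTRA_DATADIR")
    if not data_dir:
        raise EnvironmentError("PAMTRA_DATADIR is not set. " + DATA_HELP)
    if not os.path.isdir(data_dir):
        raise EnvironmentError(
            "PAMTRA_DATADIR points to a missing directory: %s. %s"
            % (data_dir, DATA_HELP)
        )
    return data_dir
```

Running the check once at import time makes the problem visible immediately, rather than as a missing-file error deep inside a simulation.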