pytroll / pytroll-examples

Collection of examples for pytroll satellite data processing
GNU General Public License v3.0

Can't clone repository due to GitHub LFS limits #33

Open raybellwaves opened 4 years ago

raybellwaves commented 4 years ago

Clicking on http://binder.pangeo.io/v2/gh/pytroll/pytroll-examples/master

gives

Waiting for build to start...
Picked Git content provider.
Cloning into '/tmp/repo2dockeriatlrs7h'...
Downloading pyspectral/meteosat09_20150420_1000_snow_rgb.png (501 KB)
Error downloading object: pyspectral/meteosat09_20150420_1000_snow_rgb.png (df478b0): Smudge error: Error downloading pyspectral/meteosat09_20150420_1000_snow_rgb.png (df478b0d5729763467b306672d2dc178dafe6bf8a81a7182cfaa629f2e09e179): batch response: Git LFS is disabled for this repository.

Errors logged to /tmp/repo2dockeriatlrs7h/.git/lfs/logs/20200809T011053.553152168.log
Use `git lfs logs last` to view the log.
error: external filter 'git-lfs filter-process' failed
fatal: pyspectral/meteosat09_20150420_1000_snow_rgb.png: smudge filter lfs failed
Error during build: Command '['git', 'reset', '--hard', '832949eadf74e5b392f9c3197c492a87b8205155']' returned non-zero exit status 128.

Looks like something to do with Git LFS. Not sure if this would help: https://filesystem-spec.readthedocs.io/en/latest/api.html#id0

djhoese commented 4 years ago

This is a known issue but somehow never made it into an official issue. Thanks for starting this. This was actually brought up at the last monthly status meeting. On mobile but will fill in details next week.

djhoese commented 4 years ago

So the main issue is GitHub's Git LFS bandwidth limits. We originally switched to storing the PNG images produced by notebook executions in this repository with Git LFS because it should have been more efficient and let us store more images as we add more examples. However, GitHub limits not only how much you store (we aren't storing much, ~6MB of the 1GB limit) but also how much people download (git clones, etc.). So even though we don't have much stored, we are hitting the limits because of the number of people downloading the repository.
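To make the storage-versus-bandwidth point concrete, here is a rough back-of-the-envelope sketch. It assumes every clone downloads the full ~6 MB of LFS objects and that the monthly bandwidth quota is 1 GB; that quota figure is an assumption for illustration, not a number given in this thread, and `clones_until_quota` is a hypothetical helper name.

```python
def clones_until_quota(lfs_payload_mb: float, monthly_quota_mb: float) -> int:
    """Return how many full clones fit inside a monthly LFS bandwidth quota.

    Assumes each clone downloads every LFS object exactly once.
    """
    if lfs_payload_mb <= 0:
        raise ValueError("lfs_payload_mb must be positive")
    return int(monthly_quota_mb // lfs_payload_mb)


# ~6 MB of LFS objects is the figure from the comment above; the 1 GB
# (1024 MB) monthly bandwidth quota is an assumed value for illustration.
print(clones_until_quota(6, 1024))  # prints 170
```

Under those assumptions, fewer than two hundred clones or Binder builds per month would be enough to exhaust the quota, even though the stored data is tiny.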

We've been thinking of moving back to storing the images as normal files in the repository, but have also been considering remote storage at one of our institutions. Or I suppose we could just not store any of the resulting images. Any other ideas @raybellwaves?

raybellwaves commented 4 years ago

Without much thought, I like the idea of moving the images to a separate repo called pytroll-gallery. It could then be accompanied by something like Read the Docs (rtd), which displays the images on a website. Then put a note in this repo saying the generated images can be viewed at ...

djhoese commented 4 years ago

I've really wanted to make a real gallery page for satpy and the other pytroll tools. We've had a lot of trouble with it because of how big our input datasets usually are and the time it takes to generate some of the images, and we would have to downscale the resulting images if we wanted to reasonably store several of them. A separate repository may have to be the answer. We can then have the various projects point to the images, but it would be good if we automatically generated them with Travis or something when possible.

mraspaud commented 4 years ago

The discussion on Slack led to a consensus about going towards sphinx-gallery. Now we just need to set it up...

djhoese commented 4 years ago

I'm not sure I'm fully convinced about moving everything to sphinx-gallery. If we do that, then all examples have to be Python scripts. This means no notebooks on BinderHub, which I think people really like. The other option was https://sphinx-nbexamples.readthedocs.io/en/latest/ or something similar.

raybellwaves commented 4 years ago

Slightly different error message in Binder now:

Submodule 'tutorial-satpy-half-day' (git@github.com:pytroll/tutorial-satpy-half-day.git) registered for path 'tutorial-satpy-half-day'
Cloning into '/tmp/repo2docker7vgiub3d/tutorial-satpy-half-day'...
error: cannot run ssh: No such file or directory
fatal: unable to fork
fatal: clone of 'git@github.com:pytroll/tutorial-satpy-half-day.git' into submodule path '/tmp/repo2docker7vgiub3d/tutorial-satpy-half-day' failed
Failed to clone 'tutorial-satpy-half-day'. Retry scheduled
Cloning into '/tmp/repo2docker7vgiub3d/tutorial-satpy-half-day'...
error: cannot run ssh: No such file or directory
fatal: unable to fork
fatal: clone of 'git@github.com:pytroll/tutorial-satpy-half-day.git' into submodule path '/tmp/repo2docker7vgiub3d/tutorial-satpy-half-day' failed
Failed to clone 'tutorial-satpy-half-day' a second time, aborting
Error during build: Command '['git', 'submodule', 'update', '--init', '--recursive']' returned non-zero exit status 1.

djhoese commented 4 years ago

Interesting. So it seems that because I added the submodule with the SSH URL, cloning it requires ssh to exist in the build environment and requires you (the user) to have SSH access as well. Let me see if I can fix this.
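The thread does not say exactly what change was made, but a common remedy for this kind of failure is to point the submodule at an HTTPS URL, which can be cloned anonymously without an ssh client. The sketch below shows one way to derive that URL from the SSH form; `ssh_to_https` is a hypothetical helper written for illustration, not part of git or any pytroll tool.

```python
import re

# Matches scp-style SSH URLs ("git@host:owner/repo.git") and the explicit
# "ssh://git@host/owner/repo.git" form.
_SSH_URL = re.compile(r"^(?:ssh://)?git@(?P<host>[^:/]+)[:/](?P<path>.+)$")


def ssh_to_https(url: str) -> str:
    """Rewrite a git SSH URL as HTTPS; return other URLs unchanged."""
    match = _SSH_URL.match(url)
    if match is None:
        return url
    return f"https://{match.group('host')}/{match.group('path')}"


print(ssh_to_https("git@github.com:pytroll/tutorial-satpy-half-day.git"))
# prints https://github.com/pytroll/tutorial-satpy-half-day.git
```

The resulting URL would typically replace the `url` entry for the submodule in `.gitmodules`, followed by `git submodule sync` so existing checkouts pick up the change.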

djhoese commented 4 years ago

Can you try now?

raybellwaves commented 4 years ago

Thanks. Much further. Now got an error:

ResolvePackageNotFound:
  - intake::intake-xarray