Closed suryadheeshjith closed 7 months ago
I have asked @adam-subel to take a look at the notebook; he will let me know when he has. @jbusecke, this notebook currently uses data too large to store in this repository without Git LFS, so I have put it on Google Drive for now and download it when the notebook runs (see https://app.reviewnb.com/m2lines/data-gallery/pull/42/). Would it be better to put it in another repository and pull the data from there?
Hmm, I would like to avoid storing things in different locations, since it will make maintenance a lot harder. It seems like the animations are the offenders here when it comes to size? Can we use a more efficient compression algorithm when creating those? I feel these short, small animations do not need to be ~20MB each.
The animation generation and size aren't really an issue here, to be honest. I take care of that by skipping 10 states each time I need to update the frame in the animation. It is the required data, i.e. the ground truth (~800MB) and the predictions (~1.6GB), that is large.
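For readers following along, the frame skipping mentioned above can be sketched roughly as below. This is only an illustration, not the notebook's actual code: `frame_indices`, `n_states`, and `stride` are hypothetical names, and the stride of 10 is taken from the comment above.

```python
def frame_indices(n_states: int, stride: int = 10) -> list:
    """Return the indices of the states that are drawn as animation frames.

    Hypothetical helper: only every `stride`-th state becomes a frame,
    which keeps the number of rendered frames (and the file size) small.
    """
    if stride < 1:
        raise ValueError("stride must be a positive integer")
    return list(range(0, n_states, stride))
```

The returned indices could then be handed to whatever draws the frames, so that only a tenth of the states are rendered.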
@suryadheeshjith + @jbusecke: what's the status of this? It seems like there are some conflicts (and still some questions about the data hosting).
Ah, got it. I misunderstood the original issue then. Are you generating the notebooks (which need access to data) locally or as part of a GitHub Action?
If the point of these notebooks is just for presentation, we do not need a data store since I can just run them locally and disable execution on jupyterbook. But, all the other notebooks can be run by someone else either locally, on binder or on leaphub. I think it would be useful to keep it consistent.
I agree with Surya's point on consistency. The current setup is that it compiles on the GitHub Actions machine and, if it's successful, it is published.
We don't compile locally, as pushing the compiled HTML files is again heavy on the user side.
Also @suryadheeshjith, yesterday I renamed the folder according to the https://github.com/leap-stc/leap-stc.github.io/tree/main naming conventions, and it magically fixed previews 🤔
When you pull, it will complain a lot, but it's just basic renaming.
@jbusecke Shall we move forward with what we have right now?
@jbusecke Based on our discussion today, I will move the data to leaphub and disable its execution on jupyter book for now (just like this notebook) by adding it to this list.
We can figure out using the leaphub service account for execution of these notebooks later.
Sounds good to me. Can you open a targeted issue for this though? Just so we do not lose track of it later. Thanks
@jbusecke I have uploaded the data files on the storage bucket leap-persistent under the username 'data-gallery'. I hope this is okay.
Would you mind moving it to m2lines-data-gallery? Just to avoid any confusion between projects. And can you document this somewhere centrally as the one location to store data under?
Is there a way to move files across folders on leap? I can't find any resource for that in the technical documentation. Reuploading is kind of a pain. Solved using:
fs.mv('leap-persistent/data-gallery', 'leap-persistent/m2lines-data-gallery', recursive=True)
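For anyone repeating this, a slightly more defensive version of the move above might look like the sketch below. It assumes `fs` is an fsspec-style filesystem object (for example one created with `gcsfs.GCSFileSystem()`, which is an assumption; the thread does not say how `fs` was created). `move_prefix` is a hypothetical helper name, not part of any library.

```python
def move_prefix(fs, src: str, dst: str) -> list:
    """Move everything under `src` to `dst` and list what ended up there.

    `fs` is assumed to be an fsspec-style filesystem exposing
    `exists`, `mv`, and `find`. `move_prefix` itself is hypothetical.
    """
    # Fail early instead of silently doing nothing on a mistyped source.
    if not fs.exists(src):
        raise FileNotFoundError(f"source prefix not found: {src}")
    # Refuse to merge into an existing destination by accident.
    if fs.exists(dst):
        raise FileExistsError(f"destination already exists: {dst}")
    # Same call as in the comment above.
    fs.mv(src, dst, recursive=True)
    # Return the files now under the destination so the result can be checked.
    return fs.find(dst)


# Example usage (assumes gcsfs is installed and credentials are configured):
# import gcsfs
# fs = gcsfs.GCSFileSystem()
# move_prefix(fs, "leap-persistent/data-gallery",
#             "leap-persistent/m2lines-data-gallery")
```

The two existence checks are a design choice rather than a requirement: they only guard against a mistyped source or an accidental merge into a folder that already holds data.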
@jbusecke It would be preferable for the data to be read-only. As things stand, I can delete or move data files on leaphub that are not mine.
Hey @suryadheeshjith, I need this data, mainly the plot folder, available for our data gallery stuff: https://zenodo.org/records/8293998 (for #41).
Maybe add a doc showing how to upload to the buckets. Thanks!
@IamShubhamGupto I have added a note in the README. I will close this PR anyway and start a fresh one so we don't need to bother with merge conflicts. @jbusecke, regarding my previous comment, we can get back to it later. I will open an issue for it.
_toc.yml and show up properly in website preview.