lilab-bcb / cirrocumulus

Bring your single-cell data to life
https://cirrocumulus.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Static website online dataset #211

Closed: worsteggs closed this issue 4 months ago

worsteggs commented 4 months ago

Is your feature request related to a problem? Please describe.

When using the Static Website option, a datasets.json file is constructed as follows: [ { "id": "Amygdala", "name": "Amygdala", "url": "https://data.braincellatlas.org/mock/pbmc3k_no_raw.jsonl.idx.json" } ]

Is url the address of the data source? In the tutorial, this url is a local path, such as "url": "pbmc3k/pbmc3k.jsonl". However, my data is very large (tens of gigabytes), so I store it in a public repository, as in the example above: https://data.braincellatlas.org/mock/pbmc3k_no_raw.jsonl.idx.json, which is a publicly and freely accessible link. I need to be able to load the data directly from that online location, rather than having to download it into the project folder.


Describe the solution you'd like

As described above, the Static Website should be able to load online data. Specifically, the url in datasets.json could point to any publicly accessible data link, rather than only to a file inside the project.

Describe alternatives you've considered

A more specific scenario: we currently have a cirro Static Website deployed with GitHub Pages. GitHub does not allow storing large files, and even a GitHub large file has a maximum size of only 5 GB, which is far below the size of my actual data file. It would therefore be a great solution if the static build files could be deployed on GitHub Pages while still loading large data files hosted elsewhere. The current idea, as described above, is to put the data files in a public repository and reference them by URL in datasets.json.

Additional context

Lastly, thank you guys so much for your awesome work!

Yours, Zilch

joshua-gould commented 4 months ago

I believe this should already work. Note that if you host your dataset on a different server from the one that serves the static website, you will need to set the appropriate CORS headers on the server hosting the dataset.
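
One way to check whether a data host already sends a suitable header is to request the file with an Origin header and inspect the response. The following is a minimal sketch using only the Python standard library; the function names `origin_allowed` and `check_cors` are hypothetical helpers written for illustration and are not part of cirrocumulus. It only covers the simple case of a plain GET and does not model preflight requests.

```python
from urllib.request import Request, urlopen


def origin_allowed(allow_origin_header, site_origin):
    """Return True if an Access-Control-Allow-Origin value permits site_origin."""
    if allow_origin_header is None:
        # No header at all: a browser would block the cross-origin read.
        return False
    value = allow_origin_header.strip()
    # "*" allows any origin; otherwise the value must match the site's origin exactly.
    return value == "*" or value == site_origin


def check_cors(dataset_url, site_origin, timeout=10):
    """Request dataset_url with an Origin header and report whether it is readable cross-origin."""
    request = Request(dataset_url, headers={"Origin": site_origin})
    with urlopen(request, timeout=timeout) as response:
        header = response.headers.get("Access-Control-Allow-Origin")
    return origin_allowed(header, site_origin)
```

For example, `check_cors(url, "https://example.github.io")` could be called with the url value from datasets.json and the origin of the GitHub Pages site (the origin shown here is a placeholder). A False result suggests that the browser would refuse to expose the response to the static website.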

worsteggs commented 4 months ago

Thanks! It was indeed a CORS issue, which I solved by setting Access-Control-Allow-Origin on the server. Thanks again for all your work!
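
For readers facing the same problem, the fix described above can be illustrated with a small file server that adds the Access-Control-Allow-Origin header to every response. This is a sketch under stated assumptions, not the configuration used in this thread: the server software actually used is not mentioned, and the names `CORSRequestHandler` and `make_server` are hypothetical. Python's built-in handler does not implement byte-range requests, so this is meant to show where the header goes rather than to serve multi-gigabyte datasets in production.

```python
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


class CORSRequestHandler(SimpleHTTPRequestHandler):
    """Static file handler that adds a CORS header to every response."""

    # Assumption: "*" allows any site; set this to the static website's origin to restrict access.
    allowed_origin = "*"

    def end_headers(self):
        # Add the header just before the header section of the response is finished.
        self.send_header("Access-Control-Allow-Origin", self.allowed_origin)
        super().end_headers()


def make_server(directory, port=8000, allowed_origin="*"):
    """Build a server that serves files from directory with the given allowed origin."""
    handler = type("Handler", (CORSRequestHandler,), {"allowed_origin": allowed_origin})
    return ThreadingHTTPServer(("", port), partial(handler, directory=directory))
```

Calling `make_server("data", allowed_origin="https://example.github.io").serve_forever()` would serve the files under a local data directory; both the directory name and the origin are placeholders. On a production web server or object store, the equivalent step is to configure the same response header there.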