brainglobe / brainglobe-atlasapi

A lightweight python module to interact with atlases for systems neuroscience
https://brainglobe.info/documentation/brainglobe-atlasapi/index.html
BSD 3-Clause "New" or "Revised" License

Use pooch to download data #284

Open adamltyson opened 6 months ago

adamltyson commented 6 months ago

Pooch could be used to download:

This would have several benefits. Mainly, we could remove all of our own download code, and we would also get validation that atlases have downloaded properly (with checksums etc.).
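
For illustration, here is a minimal standard-library sketch of what checksum validation of a downloaded file amounts to. It is not the brainglobe-atlasapi code and it does not use pooch; the function names `sha256_of` and `is_valid_download` are hypothetical. Pooch performs a comparable check itself when it is given a known hash (for example through the `known_hash` argument of `pooch.retrieve`), which is the part we would no longer need to maintain.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in fixed-size chunks so large atlas archives are not
        # loaded into memory all at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_valid_download(path, expected_sha256):
    """Return True if the file exists and its checksum matches.

    Both names here are hypothetical helpers for this sketch only.
    """
    path = Path(path)
    return path.is_file() and sha256_of(path) == expected_sha256.lower()
```

A truncated or corrupted download produces a different digest, so `is_valid_download` returns `False` and the file can be fetched again.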

adamltyson commented 3 months ago

This should hopefully fix https://github.com/brainglobe/brainglobe-atlasapi/issues/334

PolarBean commented 2 months ago

I support using pooch to download the atlases in the atlas creation scripts, but one benefit of using a centralised database of atlases is that we avoid link rot. So many of the atlasing papers I have been reviewing recently have dead links to the data files. In many cases these datasets are completely lost. Maybe this is out of scope for the API though.

adamltyson commented 2 months ago

This issue is about using pooch to:

It only replaces the ad-hoc approach of using different download functions. We will still store our own central copy of all (repackaged) atlases on GIN (and in the future, likely mirrored elsewhere).

PolarBean commented 2 months ago

Ah, that makes sense then!

NicoKiaru commented 2 months ago

I don't know if this is really part of this issue, but why not store the raw atlas data on Zenodo? I get a very slow download speed with brainglobe (less than 1 MB/s). Zenodo has very good infrastructure and allows direct download links without a login, etc. With it I get a download speed of over 50 Mb/s over wifi, so wifi is probably the limiting factor there.

adamltyson commented 1 month ago

It's a different issue, but a good point. We're looking at setting up mirrors to improve download speed, but Zenodo does seem like a good option. I'm not sure what happened with GIN; I don't recall it being this slow when we first set it up.
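
As a rough sketch of how mirrors could be used on the client side, the snippet below tries a list of URLs in order and returns the data from the first one that responds. This is only an assumption about one possible design, not something decided in this thread. The function name `fetch_from_mirrors` is hypothetical, and the `fetch` callable is injected so the sketch does not depend on any particular download library.

```python
def fetch_from_mirrors(urls, fetch):
    """Try each URL in order and return (url, data) from the first success.

    `fetch` is any callable that takes a URL and returns the downloaded
    bytes, raising OSError on failure (many network errors are
    subclasses of OSError).
    """
    errors = []
    for url in urls:
        try:
            return url, fetch(url)
        except OSError as exc:
            # Remember why this mirror failed, then move on to the next.
            errors.append(f"{url}: {exc}")
    raise OSError("all mirrors failed: " + "; ".join(errors))
```

Ordering the list puts the preferred host first, with slower or less reliable hosts acting only as fallbacks.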