AstarVienna / ScopeSim

A telescope observation simulator for Python.
GNU General Public License v3.0

Do we still want to store SVO data in the repo? #453

Closed: teutoburg closed this issue 3 weeks ago

teutoburg commented 1 month ago

Commit 79ef96d added quite a large amount of data to the repo that was originally retrieved and cached (at least in theory) from SVO.

I'm now generally looking into the whole SVO infrastructure we have in effects.ter_curves_utils, with the ultimate goal of getting rid of one of the last remaining astropy-based downloads (PR coming soon), and I'm now wondering if storing all this data in the repo and actually packaging it with ScopeSim is (still) the best approach. Thoughts @hugobuddel ?

hugobuddel commented 1 month ago

Perhaps we can move the data to ScopeSim_Data and make that a dependency of ScopeSim. (ScopeSim-the-application that is; ScopeSim-the-library might not need it).

But I don't want to remove the data entirely, because we (apparently) cannot trust the SVO to be online. I had to scour old astropy caches to find the data in that commit, and I don't want us to be in that situation again. Also, I want the essential part of our software to work without internet access.

teutoburg commented 1 month ago

> But I don't want to remove the data entirely, because we (apparently) cannot trust the SVO to be online. I had to scour old astropy caches to find the data in that commit, and I don't want us to be in that situation again. Also, I want the essential part of our software to work without internet access.

Yeah, I can see that. Though I'm considering renaming the files to .xml (which is what they are anyway) because the way they're named now confuses Windows (e.g. HAWKI.H, Windows thinks it's a C/C++ Header) and also GitHub (e.g. HAWKI.Ks, GitHub says on ScopeSim's repo page we're using KerboScript, which is a bit odd). And then ofc adapt a few lines so ScopeSim still finds those files, but that shouldn't be too much of a hassle.
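A minimal sketch of what such a rename could look like, assuming the SVO files sit in one flat directory and that appending `.xml` to the existing name (e.g. `HAWKI.H` becomes `HAWKI.H.xml`) is the desired scheme. The function name `rename_to_xml` and the `dry_run` flag are hypothetical and not part of ScopeSim; inside the git repo one would likely use `git mv` instead of a plain rename.

```python
from pathlib import Path


def rename_to_xml(data_dir: Path, dry_run: bool = True) -> list[tuple[Path, Path]]:
    """Append ".xml" to every file in data_dir that does not already end in it.

    Hypothetical helper: returns the (old, new) pairs and only touches the
    file system when dry_run is False.
    """
    renames = []
    for path in sorted(data_dir.iterdir()):
        # Skip subdirectories and files that already carry the .xml suffix.
        if not path.is_file() or path.suffix.lower() == ".xml":
            continue
        # Keep the full original name (filter IDs contain dots) and append.
        target = path.with_name(path.name + ".xml")
        renames.append((path, target))
        if not dry_run:
            path.rename(target)
    return renames
```

Running it first with `dry_run=True` lists the planned renames without changing anything, which makes it easy to check that no filter ID gets mangled.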

Also I need to sort out the caching while removing the astropy-based download there. I'd like to avoid saving files to the installed package, wherever that is, and rather use a "neutral" cache location (which is done indeed by the astropy downloads, but just saying...).

> Perhaps we can move the data to ScopeSim_Data and make that a dependency of ScopeSim. (ScopeSim-the-application that is; ScopeSim-the-library might not need it).

Eventually something along those lines might be good...

hugobuddel commented 1 month ago

> Though I'm considering renaming the files to .xml (which is what they are anyway) because the way they're named now confuses Windows (e.g. HAWKI.H, Windows thinks it's a C/C++ Header) and also GitHub (e.g. HAWKI.Ks, GitHub says on ScopeSim's repo page we're using KerboScript, which is a bit odd). And then ofc adapt a few lines so ScopeSim still finds those files, but that shouldn't be too much of a hassle.

OK renaming the files is fine. I didn't actually pay attention to that.

> Also I need to sort out the caching while removing the astropy-based download there. I'd like to avoid saving files to the installed package, wherever that is, and rather use a "neutral" cache location (which is done indeed by the astropy downloads, but just saying...).

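I propose to have the same three-layer structure as I did for spextra (IIRC):

  • Use a path explicitly set by the user
  • Use the ScopeSim_Data directory if it is installed
  • Use a default path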

Using the ScopeSim_Data directory would be particularly useful for finding already cached data, but in its current form it might be less useful for storing newly retrieved data, because that would (IIRC) effectively mean storing data in the site-packages dir, which is bad.

> Perhaps we can move the data to ScopeSim_Data and make that a dependency of ScopeSim. (ScopeSim-the-application that is; ScopeSim-the-library might not need it).

> Eventually something along those lines might be good...

We can take steps towards that. Maybe we can have ScopeSim_Data move its data to somewhere other than the site-packages directory? Or have two locations?

teutoburg commented 1 month ago

> I propose to have the same three-layer structure as I did for spextra (IIRC):
>
>   • Use a path explicitly set by the user
>   • Use the ScopeSim_Data directory if it is installed
>   • Use a default path

That's what skycalc_ipy does now. We could indeed standardize on that, ideally with the slight twist, as you mentioned, that ScopeSim_Data should do something other than modify the site-packages directory...
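For illustration, the three-layer lookup could be sketched roughly like this. It is not the code used by spextra or skycalc_ipy; the import name `scopesim_data`, the default cache path, and the helpers `find_data_package_dir`, `candidate_dirs`, and `find_cached` are all assumptions made up for this sketch.

```python
import importlib.util
from pathlib import Path

# Hypothetical "neutral" default cache location, outside of site-packages.
DEFAULT_CACHE = Path.home() / ".cache" / "scopesim" / "svo"


def find_data_package_dir(package_name="scopesim_data"):
    """Return the install directory of the data package, or None if absent.

    The import name "scopesim_data" is an assumption for this sketch.
    """
    spec = importlib.util.find_spec(package_name)
    if spec is None or not spec.submodule_search_locations:
        return None
    return Path(next(iter(spec.submodule_search_locations)))


def candidate_dirs(user_path=None, package_name="scopesim_data",
                   default=DEFAULT_CACHE):
    """List the directories to search, in order of priority."""
    dirs = []
    if user_path is not None:          # layer 1: explicitly set by the user
        dirs.append(Path(user_path))
    package_dir = find_data_package_dir(package_name)
    if package_dir is not None:        # layer 2: data package, if installed
        dirs.append(package_dir)
    dirs.append(Path(default))         # layer 3: default path
    return dirs


def find_cached(filename, user_path=None, package_name="scopesim_data",
                default=DEFAULT_CACHE):
    """Return the first existing copy of filename, or None if not cached."""
    for directory in candidate_dirs(user_path, package_name, default):
        candidate = directory / filename
        if candidate.is_file():
            return candidate
    return None
```

Keeping the search order separate from the decision of where to write would allow reading from the data package while storing newly downloaded files in the user path or the default path only.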

hugobuddel commented 1 month ago

Oh yeah, skycalc_ipy, because we can also not trust ESO :-).

One note: downloading to the ScopeSim_Data directory is also a feature. The nightly job of ScopeSim_Data installs ScopeSim_Data with pip install -e . and then everything is downloaded into the git clone. The job will subsequently create a pull request if there is any new data. So while the behaviour is bad for users, it is also essential 😇 . But we can manage that.

teutoburg commented 1 month ago

> One note: downloading to the ScopeSim_Data directory is also a feature. The nightly job of ScopeSim_Data installs ScopeSim_Data with pip install -e . and then everything is downloaded into the git clone. The job will subsequently create a pull request if there is any new data. So while the behaviour is bad for users, it is also essential 😇 . But we can manage that.

That's fine because ScopeSim_Data is not a PyPI package (yet? idk)...

hugobuddel commented 4 weeks ago

I think we can make ScopeSim_Data a PyPI package. I initially intended it just for internal use, but I have recommended the package to others as well, so making it a PyPI package makes sense to me.