Currently cannot fetch from hitran.org

arunavabasucom / radis-app

A web app for high-resolution infrared molecular spectra using RADIS

https://radis.app

GNU Lesser General Public License v3.0


Currently cannot fetch from hitran.org #527

Closed erwanp closed 2 years ago

erwanp commented 2 years ago

I'll retry in a few days

suzil commented 2 years ago

Would we reduce the load on HITRAN if there were a persistent data store in the Lambda runtime environment? I think RADIS caches this data, right?

erwanp commented 2 years ago

Yes, we would only need to download once every few months. The HITRAN database doesn't change much. Total data is ~2 GB.

Radis-Lab already caches everything, using RADIS's "fetch_hitran": https://github.com/radis/radis-lab/blob/main/databases/download_hitran.py

suzil commented 2 years ago

I might try attaching EFS to give the Lambdas a persistent file system in order to cut down on downloads. I'll have to check on the costs of this.

I'm also looking into moving this to something other than Lambda, but Lambda is a much cheaper option. The hosting cost of the current setup is ~$1/month.

suzil commented 2 years ago

Possible approaches to this:

Cache the HITRAN data on EFS
Move the data to a database that we can control, preferably a serverless kind that charges based on usage such as Aurora Serverless or DynamoDB On-Demand

EFS is probably simpler since it's just a file system, but I'd really like to keep costs low for whatever approach is used (low meaning less than a few dollars a month). RADIS app isn't used very frequently, so if we use a data layer that is priced per request, costs should stay low.

suzil commented 2 years ago

Is this still relevant now that we have a persistent server and run the download_hitran.py script? @erwanp https://github.com/suzil/radis-app/blob/main/server/download_hitran.py

erwanp commented 2 years ago

Not an issue anymore!

erwanp commented 2 years ago

Btw @suzil, can users download from the persistent server?

I.e., it could be a very easy way to directly get the RADIS-formatted HDF5 files for HITRAN or HITEMP, and save a lot of the parsing time I mentioned a few weeks ago.

suzil commented 2 years ago

There's no reason we couldn't serve users data that is stored on the server.
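
As a rough sketch of what serving stored files could involve, the helper below maps a user-requested file name to a path inside the server's data directory and refuses anything that escapes it. The function name `resolve_download` and the directory layout are assumptions for illustration only; nothing here reflects how radis-app is actually implemented.

```python
from pathlib import Path
from typing import Optional


def resolve_download(data_dir: Path, requested: str) -> Optional[Path]:
    """Return the file to serve, or None if the request is invalid.

    Rejects paths that resolve outside data_dir (e.g. "../secret")
    and names that do not point at an existing regular file.
    """
    root = data_dir.resolve()
    candidate = (root / requested).resolve()
    if root not in candidate.parents:
        return None  # path traversal or absolute path outside the data dir
    if not candidate.is_file():
        return None  # nothing cached under that name
    return candidate
```

A web framework's file response could then be given the returned path, so users would download the already-parsed HDF5 files instead of re-parsing the raw data.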