BrownDwarf / gollum

A microservice for programmatic access to precomputed synthetic spectral model grids in astronomy
https://gollum-astro.readthedocs.io/
MIT License

Handle Jagged or "Ragged" 3D/4D grid arrays in PHOENIX #33

Closed Tusay closed 2 years ago

Tusay commented 2 years ago

I've just completed download of the Phoenix Grid files. 162G in 1d 13h!

I tried to run through the tutorial for simulating a spectrum with PHOENIX instead of Sonora: https://gollum-astro.readthedocs.io/en/latest/tutorials/gollum_demo_Sonora_and_BDSS.html

When trying to create the grid with: grid = PHOENIXGrid(wl_lo=wl_lo, wl_hi=wl_hi) it got to about 73% and then stopped with the following traceback: "AssertionError: Double check that the file D:\PHOENIX\phoenix.astro.physik.uni-goettingen.de\HiResFITS\PHOENIX-ACES-AGSS-COND-2011/Z-4.0/lte07800-3.00-4.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits exists"

I checked my files and it is indeed missing. ~/Z-4.0/lte07800-2.50 and ~/Z-4.0/lte07800-3.50 are both there, but not 3.00

Looking back in my terminal log, it appears as though the download didn't even look for ~/Z-4.0/lte07800-3.00* it just went straight from 2.5 to 3.5.

Is the file missing? Is there a way to get just that file to complete the grid? Is there a way to ignore that missing grid point? I'm not sure what I should do. Please advise.
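
For anyone hitting the same error, here is a minimal sketch of how one might list missing grid files before building the grid. It is not part of gollum. The helper names `phoenix_filename` and `find_missing` are hypothetical, the filename pattern is inferred from the traceback above, and the "-0.0" label for solar metallicity is an assumption about the archive's naming.

```python
from itertools import product
from pathlib import Path

SUFFIX = ".PHOENIX-ACES-AGSS-COND-2011-HiRes.fits"


def phoenix_filename(teff, logg, z):
    """Relative path of one HiRes file, following the naming seen in the traceback."""
    # Assumption: solar metallicity is labelled "-0.0" rather than "+0.0".
    z_label = "-0.0" if z == 0 else f"{z:+.1f}"
    return f"Z{z_label}/lte{int(teff):05d}-{logg:.2f}{z_label}{SUFFIX}"


def find_missing(base_dir, teff_points, logg_points, z_points):
    """Return relative paths of expected grid files that are absent on disk."""
    base = Path(base_dir)
    missing = []
    for teff, logg, z in product(teff_points, logg_points, z_points):
        rel = phoenix_filename(teff, logg, z)
        if not (base / rel).exists():
            missing.append(rel)
    return missing
```

Calling `find_missing(base_dir, [7800], [2.5, 3.0, 3.5], [-4.0])` on the download directory would report the `lte07800-3.00-4.0` file if it is absent.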

gully commented 2 years ago

Fascinating! Thank you for opening this issue! I think for the very low metallicities (a metallicity of -4 is ultra metal-poor) the PHOENIX grid itself may not have computed all logg values, so indeed this is a bug: gollum should know about the underlying presence or absence of all the PHOENIX grid points, and it currently does not. By default, the code snippet as you wrote it attempts to read in the entire grid:

grid = PHOENIXGrid(wl_lo=wl_lo, wl_hi=wl_hi)

I recommend delimiting the grid to a subset (a limited range in temperature, a limited range in surface gravity, and a limited range in metallicity) in the following way:

grid = PHOENIXGrid(wl_lo=wl_lo, wl_hi=wl_hi,
                   teff_range=[2800, 3300],
                   logg_range=[3.0, 5.0],
                   metallicity_range=[-0.5, 0.5])

A side benefit is that reading in this smaller subset of the grid will take less time.

Your missing grid point makes me realize that my code may have missed the possibility of jagged grid arrays, so thank you again for bringing this friction point to my attention! 🙏
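
One way to cope with a jagged grid is to take the grid points from the files that actually exist instead of assuming a full Cartesian product. The sketch below is illustrative only and does not show how gollum handles this. The regular expression is inferred from the filenames in this thread, and `parse_grid_point`, `available_points`, and `is_rectangular` are hypothetical names.

```python
import re

# Pattern inferred from names such as
# lte07800-3.00-4.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
PATTERN = re.compile(
    r"lte(?P<teff>\d{5})-(?P<logg>\d\.\d{2})(?P<z>[+-]\d\.\d)"
    r"\.PHOENIX-ACES-AGSS-COND-2011-HiRes\.fits$"
)


def parse_grid_point(filename):
    """Return (teff, logg, metallicity) parsed from a filename, or None if it does not match."""
    match = PATTERN.search(str(filename))
    if match is None:
        return None
    return int(match["teff"]), float(match["logg"]), float(match["z"])


def available_points(filenames):
    """Set of grid points that are actually present among the given filenames."""
    points = {parse_grid_point(name) for name in filenames}
    points.discard(None)
    return points


def is_rectangular(points):
    """True if the points fill the full product of their teff, logg and metallicity values."""
    teffs = {p[0] for p in points}
    loggs = {p[1] for p in points}
    zs = {p[2] for p in points}
    return len(points) == len(teffs) * len(loggs) * len(zs)
```

If `is_rectangular` returns False for the files on disk, the grid is jagged, and any code that assumes a dense 3D array needs a mask or a lookup keyed by these tuples.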

gully commented 2 years ago

Let me know if this workaround fixes your problem @Tusay ! 😄

Tusay commented 2 years ago

Ah! specifying the ranges of temp, logg and metallicity makes so much sense. I didn't think of that because I didn't realize those were adjustable parameters just from following the tutorial. But of course I don't need the entire grid! (it makes so much sense now that I understand better how it works, haha)

I had found several more missing files btw, though mostly confined to the extremes of the entire grid. For reference, the missing grid points are:

~/Z-4.0/lte07800-3.00-4.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z-4.0/lte08000-3.50-4.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z-4.0/lte08200-3.00-4.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z-4.0/lte08200-3.50-4.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z-4.0/lte08200-4.00-4.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z-4.0/lte08400-3.00-4.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z-4.0/lte08400-3.50-4.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z-4.0/lte08400-4.00-4.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z-4.0/lte08600-3.50-4.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z-4.0/lte08600-4.00-4.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z-4.0/lte08800-3.00-4.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z-4.0/lte08800-3.50-4.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z-4.0/lte09000-3.00-4.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z-4.0/lte09000-3.50-4.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z-4.0/lte09000-4.00-4.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z-4.0/lte09200-3.00-4.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z-4.0/lte09200-3.50-4.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z-4.0/lte09200-4.00-4.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z-4.0/lte09400-3.50-4.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z-4.0/lte09400-4.00-4.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z+1.0/lte09600-2.00+1.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z+1.0/lte09800-2.00+1.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z+1.0/lte10000-2.00+1.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z+1.0/lte10200-2.00+1.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z+1.0/lte10400-2.00+1.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z+1.0/lte10600-2.00+1.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z+1.0/lte10800-2.00+1.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z+1.0/lte11000-2.00+1.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z+1.0/lte11200-2.00+1.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z+1.0/lte11400-2.00+1.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z+1.0/lte11600-2.00+1.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z+1.0/lte11600-2.50+1.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z+1.0/lte11800-2.00+1.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z+1.0/lte11800-2.50+1.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z+1.0/lte12000-2.00+1.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z+1.0/lte12000-2.50+1.0.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z+0.5/lte11000-2.00+0.5.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z+0.5/lte11200-2.00+0.5.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z+0.5/lte11400-2.00+0.5.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z+0.5/lte11600-2.00+0.5.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z+0.5/lte11800-2.00+0.5.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z+0.5/lte12000-2.00+0.5.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z+0.5/lte12000-5.50+0.5.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits
~/Z+1.0/lte12000-5.50+0.5.PHOENIX-ACES-AGSS-COND-2011-HiRes.fits

I agree that the grid should account for missing files or jagged arrays as you put it. Also, I recommend noting in the tutorial that loading in the entire grid is excessive, slow and eats up too much memory. After trying to apply a temporary patch by simply copying adjacent arrays and renaming them (in order to get a full set so it would stop complaining) my computer ran out of resources at 100% completion and couldn't render the dashboard.

Limiting the parameter ranges as suggested allowed it to load much faster and reduced the overhead so I could load multiple instances in separate cells with no memory issues. It works great now! Thank you!
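
To put rough numbers on the speed-up, the sketch below counts grid points in the suggested subset and in a full rectangular grid. All step sizes and the full-grid ranges are assumptions written in by hand (100 K steps up to 7000 K and 200 K above, 0.5 dex in logg, 1.0 dex in metallicity below -2.0 and 0.5 dex above), so check them against the published grid sampling before relying on the counts. `n_points` is a hypothetical helper.

```python
def n_points(lo, hi, step):
    """Number of evenly spaced grid values from lo to hi, inclusive."""
    return int(round((hi - lo) / step)) + 1


# Subset from the workaround above (assumed steps: 100 K, 0.5 dex, 0.5 dex).
subset = (
    n_points(2800, 3300, 100)
    * n_points(3.0, 5.0, 0.5)
    * n_points(-0.5, 0.5, 0.5)
)

# Full rectangular grid under the assumed sampling, ignoring missing files.
full_teff = n_points(2300, 7000, 100) + n_points(7200, 12000, 200)
full_logg = n_points(0.0, 6.0, 0.5)
full_z = n_points(-4.0, -2.0, 1.0) + n_points(-1.5, 1.0, 0.5)
full = full_teff * full_logg * full_z

print(f"subset: {subset} spectra, full grid: up to {full} spectra")
```

Under these assumptions the subset holds 90 spectra against several thousand for the full grid, which is consistent with the much shorter load time and lower memory use reported here.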

gully commented 2 years ago

The PHOENIX paper section 2.1 and Table 1 show the grid sampling:

[image: grid sampling from the PHOENIX paper]

Sujay-Shankar commented 2 years ago

Closed by #59. Thank you @Tusay for raising this issue! Let us know if anything else in the same vein arises.