Closed keflavich closed 7 years ago
As a step along the way, and possibly the only one I'm interested in implementing, I'd like to be able to parse CDMS results into astropy tables. Here is an example query:
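import requests
import bs4

# CDMS VAMDC-TAP endpoint; it already ends in '/', so append 'sync' without another slash
url = 'http://cdms.ph1.uni-koeln.de/cdms/tap/'
rslt = requests.post(url + "sync", data={'REQUEST': "doQuery", 'LANG': 'VSS2', 'FORMAT': 'XSAMS', 'QUERY': "SELECT SPECIES WHERE MoleculeStoichiometricFormula='CH2O'"})
# html5lib lowercases the XSAMS tag names, hence the lowercase lookups below
bb = bs4.BeautifulSoup(rslt.content, 'html5lib')
# keep the first molecule whose structural formula is H2CO
h = [x for x in bb.find_all('molecule') if x.ordinarystructuralformula.value.text == 'H2CO'][0]
# the partition function block holds two data lists: temperatures and Q values
tem_, Q_ = h.partitionfunction.find_all('datalist')
tem = [float(x) for x in tem_.text.split()]
Q = [float(x) for x in Q_.text.split()]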
So the first priority is implementing a CDMS table parser. @vilhelmp, I think you might also be interested in this?
Yes, this would be nice indeed.
After working a bit with Holger Muller (the guy behind http://www.astro.uni-koeln.de/cdms), I realized that it might also be good to have an interface to the "normal" cgi-bin POST interface. When they update any files in the database, the updates become accessible first through the web interface (i.e. http://www.astro.uni-koeln.de/cgi-bin/cdmssearch). The VAMDC version comes later, since they have to do some manual updating for that to happen.
Search: I've been trying to figure out the relevant POST request (using Live HTTP headers Chrome plugin).
Result tables: The results are in fixed-width tables with the same format as the JPL molecular line catalog (http://spec.jpl.nasa.gov/ftp/pub/catalog/README), where the layout is given as a Fortran fixed-width format specifier. For reading the tables, the obvious go-to would be Astropy tables with format='fixed_width_no_header' (see http://stackoverflow.com/questions/35018200/reading-table-data-card-images-with-format-specifier-given-into-python?noredirect=1#comment57809951_35018200). An alternative is the old package FortranFormat, but that would mean adding another required package. It could be a good idea to write a short translation tool that takes a Fortran format specifier, e.g. "(F13.4,F8.4, F8.4, I2,F10.4, I3, I7, I4, 6I2, 6I2)", and translates it into the "col_starts" and "dtype" inputs of the Astropy Table fixed-width reader.
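A rough sketch of what such a translation tool might look like, using only the standard library. The function name `fortran_to_col_starts` is hypothetical. It assumes a flat specifier (no nested groups) built from I, F, E, D, A descriptors and X skips, and it returns plain Python types rather than anything astropy-specific:

```python
import re

def fortran_to_col_starts(spec):
    """Translate a flat Fortran format specifier into zero-based column
    starts, inclusive column ends, and Python types (hypothetical helper)."""
    body = spec.strip().strip('()')
    starts, ends, dtypes = [], [], []
    pos = 0  # running character offset within a line
    for token in body.split(','):
        token = token.strip().upper()
        skip = re.fullmatch(r'(\d*)X', token)
        if skip:
            # nX descriptors only advance the position
            pos += int(skip.group(1) or 1)
            continue
        m = re.fullmatch(r'(\d*)([IFEDA])(\d+)(?:\.\d+)?', token)
        if m is None:
            raise ValueError('unsupported descriptor: %r' % token)
        repeat = int(m.group(1) or 1)
        kind, width = m.group(2), int(m.group(3))
        dtype = {'I': int, 'A': str}.get(kind, float)
        for _ in range(repeat):
            starts.append(pos)
            ends.append(pos + width - 1)
            dtypes.append(dtype)
            pos += width
    return starts, ends, dtypes

starts, ends, dtypes = fortran_to_col_starts(
    "(F13.4,F8.4, F8.4, I2,F10.4, I3, I7, I4, 6I2, 6I2)")
print(starts[:8])   # [0, 13, 21, 29, 31, 41, 44, 51]
print(len(starts))  # 20 (each 6I2 group expands to six two-character columns)
```

The starts and inclusive ends are meant to line up with the reader's col_starts and col_ends arguments. How the types get applied is left to the caller.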
Anyway, just some thoughts on this.
:+1: @vilhelmp, this is the best approach for astroquery, at least until we integrate vamdclib into astroquery (which I hope we can eventually do).
The shortest way I figured out to get the CDMS text results into Astropy Table format is the following:
from astropy.table import Table
import astropy.constants as c
import astropy.units as u
cdms_colnames = ('FREQ', 'ERR', 'LGINT', 'DR', 'ELO', 'GUP', 'TAG', 'QNFMT', 'QN1', 'QN2', 'SPECIES')
cdms_colstarts = (0, 13, 24, 35, 37, 47, 50, 57, 61, 72, 89 )
lines = Table.read('cdms_table_file.tab',
                   format='ascii.fixed_width_no_header',
                   names=cdms_colnames,
                   col_starts=cdms_colstarts,
                   )
and then proceed to parse the units
lines['FREQ'] = lines['FREQ'] * 1e-3
lines['FREQ'].unit = u.GHz
lines['ERR'] = lines['ERR'] * 1e-3
lines['ERR'].unit = u.GHz
lines['ELO'].unit = u.cm**-1
lines['ELO'] = (lines['ELO'].quantity*c.c*c.h/c.k_B).decompose()
lines['ELO'].unit = u.K
# calculate the E_up (in Kelvin) from the E_low (in Kelvin)
lines['EUP'] = lines['ELO'] + ((c.h * lines['FREQ'].quantity)/c.k_B).decompose()
Here I have to keep track of the units a bit more than usual. Is there a way to get it to just use the unit that is calculated, i.e. so that lines['ELO'] = (lines['ELO'].quantity*c.c*c.h/c.k_B).decompose() simply gives the column units of Kelvin?
Closing this one, as an experimental VAMDC module has been added in #658. Please feel free to reopen if you think otherwise.
Probably only a limited variant to match the splatalogue query tool: http://portal.vamdc.eu/