Make use of spectra and properties API endpoints

NREL / htem-api-examples

Examples of usage of the HTEM DB API

http://htem.nrel.gov


Make use of spectra and properties API endpoints #3

Closed somerandomsequence closed 6 years ago

somerandomsequence commented 6 years ago

The current code relies heavily on querying the properties of individual samples one at a time. Instead, we can make use of the more compact API endpoints.

For instance:

https://htem-api.nrel.gov/#api-Samples-GetSamplesSpectra - This endpoint can be used to get all of the spectra data for one or more samples in a single query.
https://htem-api.nrel.gov/#api-Samples-GetSamplesProp - This endpoint can be used to get all of the properties data for one or more samples in a single query.

meschw04 commented 6 years ago

Hey Caleb. I checked out the link for this. Wow, that is super handy, I was doing stuff the hard way! I'm a little confused about a couple of the extensions though. Suppose I want my function to return the "spectra" for sample 6880. According to the link above, I would use a urllib query for "https://api.hpc.nrel.gov/xrd/api/samples/6880/spectra" and it would return just the optical and xrd spectra for the sample. However, when I do that, I get a 404 error. Am I querying it correctly? I assumed it would have the same format as the "xrf" queries, which look like "https://api.hpc.nrel.gov/xrd/api/samples/6880/xrf" (this works just fine), but that doesn't appear to work for either the /spectra or the /prop extension.

somerandomsequence commented 6 years ago

Hi Marcus,

The format for these endpoints is a little nonstandard, so that they can support querying information for multiple samples:

https://api.hpc.nrel.gov/xrd/api/samples/spectra?ids=6880

and, e.g., for multiple samples all at once:

https://api.hpc.nrel.gov/xrd/api/samples/spectra?ids=6880,6881
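
A minimal sketch of building these batch URLs from Python, assuming the base URL above is reachable and that the endpoint returns a JSON body (the response layout is not described in this thread, so nothing is done with it beyond `json.load`). The helper names `spectra_url` and `fetch_spectra` are hypothetical and are not part of the existing lib files.

```python
import json
from urllib.request import urlopen

# Base URL copied from the examples in this thread.
BASE_URL = "https://api.hpc.nrel.gov/xrd/api"


def spectra_url(sample_ids):
    """Build the batch URL, e.g. .../samples/spectra?ids=6880,6881."""
    ids = ",".join(str(int(i)) for i in sample_ids)
    return f"{BASE_URL}/samples/spectra?ids={ids}"


def fetch_spectra(sample_ids):
    """Fetch spectra for several samples in one request.

    Assumption: the server answers with JSON; the structure is not
    documented here, so the decoded object is returned as-is.
    """
    with urlopen(spectra_url(sample_ids)) as response:
        return json.load(response)
```

Something like `fetch_spectra([6880, 6881])` would then replace a loop of per-sample requests with a single call.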

meschw04 commented 6 years ago

The lib files are now fully functional, but since they were previously using the private API, some slight tweaking is needed to make them query the spectra instead of looping over all positions in a sample. Also, in testing out the public API, I found several instances where a sample was found on the website that didn't appear to be available in the API. This seemed a little backwards. Any idea why that might be happening? Thanks, Marcus

somerandomsequence commented 6 years ago

Can you provide examples of samples that were in the public API but not the private API (and steps to reproduce)?

somerandomsequence commented 6 years ago

@meschw04 - in the current code, I'm not sure why the result from this:

t.spectra(which='optical')

looks different from:

t.spectra(which='xrd')

I'd expect them to have a similar shape, but the former is a single row containing lists and the latter is a proper data frame.

meschw04 commented 6 years ago

So here's the issue. With XRD, there are always 661 data points for wavelength, 661 for intensity, and 661 for background. That made it straightforward to represent as a conventional pandas DataFrame. With optical, it's a different story, because near-infrared takes around 400 measurements while ultraviolet takes around 800 measurements. When I tried to put them all in the same pandas DataFrame, it complained that the columns were not all the same size. My options were either to pad the shorter lists with nulls so that all the columns were the same length, or to store each list as a single cell in a one-row pandas DataFrame. Is there a better way to do this? Thanks!
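
A small sketch of the problem described above, using made-up short lists in place of the real spectra (the column names `nir` and `uv` are illustrative only). It shows that pandas rejects columns of unequal length, and what the one-row-containing-lists layout looks like.

```python
import pandas as pd

# Made-up stand-ins: the real spectra have roughly 400 and 800 points.
nir = [0.1, 0.2, 0.3]
uv = [1.0, 2.0, 3.0, 4.0, 5.0]

# Plain lists of unequal length cannot go straight into a DataFrame.
try:
    pd.DataFrame({"nir": nir, "uv": uv})
    unequal_ok = True
except ValueError:
    unequal_ok = False

# One-row layout: each cell holds an entire list.
one_row = pd.DataFrame({"nir": [nir], "uv": [uv]})
```

With this layout the individual spectra are recovered with, for example, `one_row.loc[0, "uv"]`, at the cost of losing column-wise operations.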

somerandomsequence commented 6 years ago

Hrm, okay, this seems reasonable. Thanks for clarifying.

I don't really have a strong preference between the 1-row-containing-lists and the padded data frame. From a user perspective, it's a bit surprising for a function to return a different output "shape" depending on its input parameters. On the other hand, it's also a bit surprising to have padded/null data. I'd probably lean towards the padding.

meschw04 commented 6 years ago

I decided to pad with nulls using pd.concat([],axis=1). I also adjusted the basic importing notebook and the optical notebook to reflect this change.
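
For illustration, a minimal sketch of this kind of padding with made-up data; the Series names `nir` and `uv` are assumptions, not the column names used in the lib files. Concatenating Series of different lengths along `axis=1` aligns them on the index and fills the missing positions with NaN.

```python
import pandas as pd

# Made-up stand-ins for two optical spectra of different lengths.
nir = pd.Series([0.1, 0.2, 0.3], name="nir")
uv = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0], name="uv")

# Column-wise concat takes the union of the indexes and pads with NaN.
padded = pd.concat([nir, uv], axis=1)
```

The unpadded values of a shorter column can be recovered with `padded["nir"].dropna()`.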