materials-data-facility / data-schemas

Add spectrographic analysis schema #21

Closed BenGalewsky closed 5 years ago

BenGalewsky commented 5 years ago

Problem

Researchers who produce automated analysis of their spectrographs need to be able to search for records based on numeric data uncovered by the analysis. There is currently no data block to hold this.

Approach

Created a new schema for raman_analysis - it records a list of peaks and a list of ratios of the areas between peaks.

Example Data Block

{
  "wavelength": 3.14,
  "peaks": [
    {
      "label": "d",
      "width": 100,
      "center": 55
    },
    {
      "label": "g",
      "width": 101,
      "center": 55
    },
    {
      "label": "g'",
      "width": 102,
      "center": 55
    }
  ],
  "ratios": [
    {
      "peak_1": "d",
      "peak_2": "g",
      "ratio": 42.4
    },
    {
      "peak_1": "d",
      "peak_2": "g'",
      "ratio": 42.4
    },
    {
      "peak_1": "g",
      "peak_2": "g'",
      "ratio": 42.4
    }
  ]
}
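
As a rough sketch of how a block like this could be sanity-checked before ingestion: the snippet below verifies that every ratio refers to a peak label that actually appears in the peaks list. The helper name check_ratios is hypothetical and not part of the schema or any MDF tooling; the field names follow the example above.

```python
import json

# Example block, using the same field names as the data block above.
block = json.loads("""
{
  "wavelength": 3.14,
  "peaks": [
    {"label": "d",  "width": 100, "center": 55},
    {"label": "g",  "width": 101, "center": 55},
    {"label": "g'", "width": 102, "center": 55}
  ],
  "ratios": [
    {"peak_1": "d", "peak_2": "g",  "ratio": 42.4},
    {"peak_1": "d", "peak_2": "g'", "ratio": 42.4},
    {"peak_1": "g", "peak_2": "g'", "ratio": 42.4}
  ]
}
""")


def check_ratios(block):
    """Return the peak labels used in ratios that are not defined in peaks.

    Hypothetical helper for illustration only.
    """
    labels = {peak["label"] for peak in block["peaks"]}
    missing = []
    for ratio in block["ratios"]:
        for key in ("peak_1", "peak_2"):
            if ratio[key] not in labels:
                missing.append(ratio[key])
    return missing


print(check_ratios(block))
```

An empty list means every ratio points at a known peak.
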
jgaff commented 5 years ago

Awesome, thanks Ben. To keep all of our context in one place, I'll re-ask my questions here.

1) For peaks.label: Are we enforcing a type of label here? In the first draft this was "scientific label" which I didn't really understand, and this seems even more vague. Is there a standard convention for naming peaks that we can have users follow?

2) For ratio.peak-pair: Labels of two peaks separated by a dash - this seems a little cumbersome, and dashes don't do well in Elasticsearch. Again, I'm wondering about these labels, and I'm wondering if there's a better way to express this information. Perhaps peak_1 and peak_2?
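
To illustrate the second point, here is a minimal sketch of converting a dash-separated pair into separate peak_1 and peak_2 fields. The function split_peak_pair is hypothetical, and the "d-g" input format is only my reading of the draft's peak-pair field.

```python
def split_peak_pair(peak_pair):
    """Split a dash-separated pair such as "d-g" into two explicit fields.

    Hypothetical helper; splits on the first dash only.
    """
    peak_1, sep, peak_2 = peak_pair.partition("-")
    if not sep or not peak_1 or not peak_2:
        raise ValueError(f"expected two labels separated by a dash: {peak_pair!r}")
    return {"peak_1": peak_1, "peak_2": peak_2}


print(split_peak_pair("d-g'"))
```

Note that a label which itself contains a dash makes the combined form ambiguous, which is another argument for two separate fields.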

blaiszik commented 5 years ago

Since this is already in the spectrographic_analysis block, maybe we rename spectrographic_technique to technique?

We should ask Elif and the NanoMFG collaborators to comment on how specific we need to be on peak labels. Do d, g, and g' have more descriptive names, for example? From a quick search, it appears not.

BenGalewsky commented 5 years ago

The labels for the peaks will have meaning for a specific material and its community. It seems like the labels are meaningless unless you know what the material is. For the Graphene project, we should make sure that we are populating a new common-name property in the material block.

BenGalewsky commented 5 years ago

Pushed a new commit with these review items addressed

jgaff commented 5 years ago

Great, I think that addresses what we talked about. I do see two more nits to pick:

1) common-name should be common_name in the material block.

2) peaks.width still has "Width of peak in *** units?" as the description - we should figure out the units here.

Item 2 might be solved when we ask Elif and the NanoMFG team to review.

blaiszik commented 5 years ago

The unit point here is particularly important. The unit will change depending on the technique, so we may have to specify it in the block explicitly. Ratios in this case are dimensionless.

BenGalewsky commented 5 years ago

As per our last conference call and from feedback on this schema, we decided to avoid problems with different units for different spectrographic techniques by creating a separate schema for each technique. That way the units are consistent for every entry of this type.
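
A minimal sketch of what a technique-specific schema could look like once the units are pinned down in the field descriptions. Everything here is an assumption for illustration: the name RAMAN_ANALYSIS_SCHEMA, the helper missing_peak_fields, and in particular the cm^-1 unit, which has not been confirmed in this thread.

```python
# Hypothetical schema fragment for a single technique. Because the schema
# covers only one technique, the unit can be stated once in the description.
# The cm^-1 unit is an illustrative assumption, not a decision from this thread.
RAMAN_ANALYSIS_SCHEMA = {
    "type": "object",
    "properties": {
        "peaks": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "label": {"type": "string", "description": "Label of the peak"},
                    "width": {"type": "number", "description": "Width of peak in cm^-1 (assumed unit)"},
                    "center": {"type": "number", "description": "Center of peak in cm^-1 (assumed unit)"},
                },
                "required": ["label", "width", "center"],
            },
        },
    },
}


def missing_peak_fields(block, schema=RAMAN_ANALYSIS_SCHEMA):
    """Return (peak index, field name) pairs for required fields that are absent."""
    required = schema["properties"]["peaks"]["items"]["required"]
    problems = []
    for index, peak in enumerate(block.get("peaks", [])):
        for field in required:
            if field not in peak:
                problems.append((index, field))
    return problems


print(missing_peak_fields({"peaks": [{"label": "d", "width": 100, "center": 55}]}))
```

A full implementation would more likely hand the schema to a JSON Schema validator; the hand-rolled check is only there to keep the sketch dependency-free.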