wincowgerDEV / OpenSpecy-package

Analyze, Process, Identify, and Share, Raman and (FT)IR Spectra
http://wincowger.com/OpenSpecy-package/
Creative Commons Attribution 4.0 International
[Feature]: add attributes for spectral processing #143

Open wincowgerDEV opened 11 months ago

wincowgerDEV commented 11 months ago

Description

I've been thinking about this for some time. It's definitely part of the long-term vision, but I'm not quite sure when to bring it in. One of the challenges with using spectra is that you can forget which transformations you applied to them. For example, if you took a derivative and then forgot whether it was the first or second derivative, you'd be in trouble, because you need to identify the spectrum using the appropriate library. Another example: you've transformed the data, forgotten how, and want to reverse the transformation (sometimes possible, sometimes not). Obviously the best way for a user to manage this is to code everything and version control their code so they always have both the raw data and the cleaned data, but sometimes things get lost, so I totally get it.

Problem

Keeping track of processing steps for spectra is hard.

Proposed Solution

I think we could add attributes to the OpenSpecy object type that describe how the spectra were processed. Special functions could then act on these attributes. One could let the user supply a suite of attributes to an OpenSpecy object and then process the spectra according to them. Another could reverse the recorded processing to get as close to the raw spectra as possible. A third could try to merge two OpenSpecy objects using similarities in their attributes. Let me know what you think @zachar
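To make the idea concrete, here is a minimal base R sketch of recording processing steps as an attribute. The helper `log_step()`, the attribute name `processing_log`, and the plain list standing in for an OpenSpecy object are all hypothetical and not part of the package.

```r
# Hypothetical helper: append one processing step to a "processing_log"
# attribute. Neither the helper nor the attribute name exists in OpenSpecy.
log_step <- function(obj, step, ...) {
  steps_so_far <- attr(obj, "processing_log")
  if (is.null(steps_so_far)) steps_so_far <- list()
  steps_so_far[[length(steps_so_far) + 1]] <- list(step = step, args = list(...))
  attr(obj, "processing_log") <- steps_so_far
  obj
}

# Stand-in for an OpenSpecy object (assumption): a plain list holding
# wavenumbers and a table of spectra.
spec <- list(
  wavenumber = c(1000, 1001, 1002),
  spectra = data.frame(a = c(0.1, 0.5, 0.2))
)

spec <- log_step(spec, "derivative", order = 2)
spec <- log_step(spec, "baseline", method = "polynomial")

# List the recorded steps in the order they were applied.
steps <- vapply(attr(spec, "processing_log"), function(s) s$step, character(1))
print(steps)
```

A log like this would give later functions (replaying, reversing, or comparing processing) something to read from, though which steps are actually reversible would still need to be decided per transformation.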

Alternatives Considered

Create separate metadata object types that store common processing operations.

zsteinmetz commented 11 months ago

Yeah, why not! I would maybe keep this simple in the beginning, since a well-written R script should take care of this anyway.

wincowgerDEV commented 11 months ago

Agreed. Maybe we slowly start supporting it in the background, mostly just to send the user helpful warnings, and later start supporting autocorrection of the issues.

wincowgerDEV commented 10 months ago

Along these lines, I'm also thinking we could control some of the metadata values a little better by adding checks for metadata variables. E.g., xy needs to be distinct, col_id needs to be the same as the spectrum names, and the metadata needs to have xy and col_id... I worry, though, that people may have alternative intended uses for the metadata values that we are not currently considering. Can let that simmer for a while.
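The checks mentioned above could look roughly like the following base R sketch. The function `check_metadata()` is hypothetical, and it assumes the spectra are columns of a data frame and that the metadata is a data frame with `x`, `y`, and `col_id` columns; the real OpenSpecy object layout may differ.

```r
# Hypothetical validator: returns a character vector of problems
# (empty when the metadata passes every check).
check_metadata <- function(spectra, metadata) {
  problems <- character(0)

  # The metadata must contain x, y, and col_id.
  required <- c("x", "y", "col_id")
  missing_cols <- setdiff(required, names(metadata))
  if (length(missing_cols) > 0) {
    return(paste("missing columns:", paste(missing_cols, collapse = ", ")))
  }

  # Each x/y coordinate pair must be distinct.
  if (anyDuplicated(metadata[, c("x", "y")]) > 0) {
    problems <- c(problems, "x/y pairs are not distinct")
  }

  # col_id must match the spectrum names, in the same order.
  if (!identical(as.character(metadata$col_id), names(spectra))) {
    problems <- c(problems, "col_id does not match the spectrum names")
  }

  problems
}

spectra <- data.frame(a = c(0.1, 0.2), b = c(0.3, 0.4))
good <- data.frame(x = c(1, 2), y = c(1, 1), col_id = c("a", "b"))
bad <- data.frame(x = c(1, 1), y = c(1, 1), col_id = c("a", "c"))

print(check_metadata(spectra, good))
print(check_metadata(spectra, bad))
```

Returning the problems instead of stopping leaves the caller free to raise them as warnings, which fits the concern that users may have other intended uses for the metadata.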

wincowgerDEV commented 9 months ago

Sketching this out for things that could be useful to keep track of:

Intensity Unit Options: Absorbance, Transmittance, Reflectance
Derivative Order: Deriv_0, Deriv_1, Deriv_2
Baseline Options: Raw, No_Baseline
Spectra Type: FTIR, Raman

Looks like to initialize or update these, we can just set an attribute on the OpenSpecy object for each one: attr(obj, "intensity_unit") <- NULL

Then we can retrieve the value with attr(obj, "intensity_unit") and run checks using it.
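As a rough illustration of setting, retrieving, and checking such attributes, here is a base R sketch. The setter `set_spec_attr()`, the lookup table `allowed_values`, and every attribute name other than `intensity_unit` are assumptions for illustration; the allowed values are taken from the list above. Note that in base R an attribute that was never set reads back as `NULL`, and assigning `NULL` removes an attribute.

```r
# Allowed values per attribute, taken from the sketch in this comment.
# Attribute names other than "intensity_unit" are assumptions.
allowed_values <- list(
  intensity_unit   = c("Absorbance", "Transmittance", "Reflectance"),
  derivative_order = c("Deriv_0", "Deriv_1", "Deriv_2"),
  baseline         = c("Raw", "No_Baseline"),
  spectra_type     = c("FTIR", "Raman")
)

# Hypothetical setter that rejects unknown attributes and values.
set_spec_attr <- function(obj, name, value) {
  if (!name %in% names(allowed_values)) {
    stop("unknown attribute: ", name)
  }
  if (!value %in% allowed_values[[name]]) {
    stop("'", value, "' is not an allowed value for ", name)
  }
  attr(obj, name) <- value
  obj
}

# Stand-in for an OpenSpecy object (assumption): a plain list.
spec <- list(wavenumber = c(1000, 1001), spectra = data.frame(a = c(0.1, 0.2)))

unset_before <- is.null(attr(spec, "intensity_unit"))  # nothing stored yet
spec <- set_spec_attr(spec, "intensity_unit", "Absorbance")
print(attr(spec, "intensity_unit"))

# An invalid value is rejected with an error.
bad_result <- try(set_spec_attr(spec, "intensity_unit", "Counts"), silent = TRUE)
```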

wincowgerDEV commented 9 months ago

A first pass of this addition is in now: https://github.com/wincowgerDEV/OpenSpecy-package/pull/157/commits/74e936c9a0164bdeffeebf219c958a976c1bab52

I think we probably want to add functionality for updating attributes when it makes sense. For example, if the user does a derivative transformation, we can record the order, or if they do a baseline transformation, we can add that label. Perhaps these could be used to warn people if they try to do something weird, like taking the 4th derivative, removing the baseline from a derivative-transformed spectrum, or baseline correcting something that has already been baseline corrected. There are probably other things we can try to catch too.
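A sketch of how processing functions might update the attributes and warn on suspicious sequences, in base R. The functions `deriv_step()` and `baseline_step()` are hypothetical stand-ins rather than OpenSpecy functions, a bare numeric vector stands in for a spectrum, and `diff()` and subtracting the minimum are toy placeholders for real derivative and baseline routines.

```r
# Hypothetical derivative step: warns when the order would exceed 2,
# then records the new order on the result.
deriv_step <- function(obj) {
  label <- attr(obj, "derivative_order")
  current <- if (is.null(label)) 0L else as.integer(sub("Deriv_", "", label))
  if (current >= 2L) {
    warning("spectrum is already a derivative of order ", current)
  }
  out <- diff(as.numeric(obj))  # toy finite difference
  attr(out, "derivative_order") <- paste0("Deriv_", current + 1L)
  attr(out, "baseline") <- attr(obj, "baseline")
  out
}

# Hypothetical baseline step: warns on repeated correction or on
# derivative-transformed input, then records the label.
baseline_step <- function(obj) {
  if (identical(attr(obj, "baseline"), "No_Baseline")) {
    warning("spectrum has already been baseline corrected")
  }
  label <- attr(obj, "derivative_order")
  if (!is.null(label) && label != "Deriv_0") {
    warning("removing the baseline from a derivative-transformed spectrum")
  }
  out <- as.numeric(obj) - min(obj)  # toy baseline removal
  attr(out, "derivative_order") <- label
  attr(out, "baseline") <- "No_Baseline"
  out
}

x <- c(1, 4, 9, 16, 25)
d1 <- deriv_step(x)
d2 <- deriv_step(d1)

# A third derivative and a repeated baseline correction both warn;
# tryCatch captures the first warning message raised.
third <- tryCatch(deriv_step(d2), warning = function(w) conditionMessage(w))
b <- baseline_step(x)
twice <- tryCatch(baseline_step(b), warning = function(w) conditionMessage(w))
```

Using warnings instead of errors keeps the behavior advisory, in line with the earlier suggestion to start with helpful warnings before any autocorrection.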

zsteinmetz commented 9 months ago

Sounds good to me! I'll try to have a look at PR #157 as soon as possible.