Closed maximskorik closed 2 years ago
@hechth, @xtrojak, if you feel there's room for improvements or adjustments, you can take it from here. Otherwise, I will be able to react to your suggestions and continue working on the adapter tomorrow afternoon.
Thanks, I will have a look today in the afternoon!
This PR adds functionality to connect the `main` branch to `master`. See #65 for more details.

- `Module_RTclust` column is constructed by concatenating the respective columns.
- `MatchCategory`, `theoretical.mz`, `MonoisotopicMass`, and the compound's `Name` are obtained from the rcx-simple annotation table and joined with the output of the isotope matching step.
- `Formula_ID` is created from the `compound` column of the rcx-simple annotation table and is unique for each distinct molecular formula.
- `ISgroup` values are replaced by "-". The column is not removed, to preserve the original column ordering.
- `MatchCategory` values are the same as in the original tool.
- `theoretical.mz` and `MonoisotopicMass` values of annotated isotopes in the original tool are "-". In this version these values are `NA`; otherwise the column would have to be converted to a string data type, which doesn't seem like good practice.
- `Adduct` column is identical to that in the original tool.
- `Formula` values of annotated isotopes do not contain _[+/- num] like they do in the original tool. I suspect this is not used anywhere. The mass number difference is present in the `Adduct` column, and changing the `Formula` of isotopes would require computing the mass difference induced by adducts.

Possible issues: in the rcx version of the simple annotation there may be several annotations of compounds with the same chemical formula, because our version respects isomers. As a result, compounds with the same `Formula` but different `chemical_ID` will be passed to the chemical score computation.

Example data
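To make the column behaviour described above easier to review, here is a minimal sketch in plain Python. It is not the adapter code from this PR. The row layout, the `Formula_<n>` naming scheme, the helper names `assign_formula_ids` and `make_isotope_row`, the `M+H_[+1]` label, and all numeric values are assumptions made up for illustration, and Python's `None` stands in for `NA`.

```python
# Hypothetical rows standing in for an rcx-simple annotation table.
# Identifiers and numeric values are illustrative only.
annotation = [
    {"chemical_ID": "cpd_1", "Formula": "C6H12O6", "Adduct": "M+H",
     "theoretical.mz": 181.0707, "MonoisotopicMass": 180.0634},
    # Isomer of cpd_1: same Formula, different chemical_ID.
    {"chemical_ID": "cpd_2", "Formula": "C6H12O6", "Adduct": "M+H",
     "theoretical.mz": 181.0707, "MonoisotopicMass": 180.0634},
    {"chemical_ID": "cpd_3", "Formula": "C3H7NO2", "Adduct": "M+H",
     "theoretical.mz": 90.0550, "MonoisotopicMass": 89.0477},
]


def assign_formula_ids(rows):
    """Give each distinct molecular formula one ID, in order of first appearance."""
    ids = {}
    result = []
    for row in rows:
        formula = row["Formula"]
        if formula not in ids:
            # Assumed naming scheme; the real Formula_ID format may differ.
            ids[formula] = f"Formula_{len(ids) + 1}"
        result.append({**row, "Formula_ID": ids[formula]})
    return result


def make_isotope_row(parent, adduct_label):
    """Build a row for an annotated isotope of `parent`.

    Numeric columns are left missing (None) rather than set to "-",
    so the column keeps a numeric type. Formula is copied unchanged;
    the mass number difference lives only in the Adduct label.
    """
    return {
        **parent,
        "Adduct": adduct_label,
        "theoretical.mz": None,
        "MonoisotopicMass": None,
        "ISgroup": "-",  # placeholder kept only to preserve column ordering
    }


with_ids = assign_formula_ids(annotation)
isotope = make_isotope_row(with_ids[0], "M+H_[+1]")  # hypothetical label
table = with_ids + [isotope]

# The m/z column holds only floats or missing values, never strings.
mz_column = [row["theoretical.mz"] for row in table]
numeric_only = all(v is None or isinstance(v, float) for v in mz_column)

for row in with_ids:
    print(row["chemical_ID"], row["Formula"], row["Formula_ID"])
print("numeric m/z column:", numeric_only)
```

In this sketch the two isomers share one `Formula_ID` while keeping distinct `chemical_ID` values, which is the situation the "Possible issues" note describes for the chemical score computation.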
Closes #65.