vastevenson closed this issue 2 years ago
Hi @vastevenson, you're right. I was surprised that lipidr didn't recognize "PC O-xx:yy", since it can handle "PC(O-xx:yy)" with parentheses. I'll try to fix that in the future.
For now, you can use a regex (in R, or Python if that's your preference) as follows:
expt_df[[1]] = sub("^(PC|PE) ([OP])-", "\\1\\2 ", expt_df[[1]])
expt_df[[1]] = sub("^TAG(.*)-FA.*", "TAG \\1", expt_df[[1]]) # TAGs were not parsed correctly either.
lipidr::annotate_lipids(expt_df[[1]])
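For anyone who prefers to clean the names in Python before loading the CSV into R, a rough equivalent of the two substitutions above might look like the sketch below. The function name `rename_for_lipidr` and the sample names (in particular the `TAG52:3-FA16:0` shape) are assumptions for illustration only; check them against your own annotation list.

```python
import re

def rename_for_lipidr(name: str) -> str:
    """Mirror the two R sub() calls above (hypothetical helper, not part of lipidr)."""
    # "PE O-16:0/16:0" -> "PEO 16:0/16:0": move the O/P ether marker into the class name.
    name = re.sub(r"^(PC|PE) ([OP])-", r"\1\2 ", name, count=1)
    # "TAG52:3-FA16:0" -> "TAG 52:3": drop the fatty-acid suffix and add a space.
    name = re.sub(r"^TAG(.*)-FA.*", r"TAG \1", name, count=1)
    return name

# Made-up example names; names that match neither pattern pass through unchanged.
names = ["PE O-16:0/16:0", "PC P-18:0/18:1", "TAG52:3-FA16:0", "PC 16:0/18:1"]
renamed = [rename_for_lipidr(n) for n in names]
print(renamed)
```

The renamed column can then be written back to the CSV and passed to lipidr as usual.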
Hi @ahmohamed,
Thanks for the code snippet. I can confirm this does resolve the issue. One question I have is what will lipidr do if given multiple TAGs of the same name (like TAG 52:3)? Will it sum all of these values for each sample? Or should I sum these manually?
Thanks again for your help!
-Vincent
Hi @vastevenson,
lipidr will keep duplicates as is through the workflow. You can make them unique by adding suffixes to them:
rowData(d)$Molecule = paste0(rowData(d)$Molecule, " (", rownames(d), ")")
If you need to treat them as one entity, you can probably use summarize_transitions to merge them (taking the average or max).
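If you would rather merge duplicates yourself before the data reaches lipidr, the idea can be sketched in plain Python as below. This is only an illustration of grouping rows by name and reducing each sample column; it is not how summarize_transitions is implemented. The helper `merge_duplicates`, the `"sum"` option (included because summing was asked about) and the intensity values are all made up for the example.

```python
from collections import defaultdict
from statistics import mean

def merge_duplicates(rows, how="average"):
    """Merge rows sharing a lipid name (hypothetical helper).

    rows: list of (lipid_name, [one value per sample]).
    how:  "average", "max" or "sum".
    """
    reducers = {"average": mean, "max": max, "sum": sum}
    reduce_column = reducers[how]
    grouped = defaultdict(list)
    for name, values in rows:
        grouped[name].append(values)
    # zip(*group) walks the duplicated rows one sample column at a time.
    return {
        name: [reduce_column(column) for column in zip(*group)]
        for name, group in grouped.items()
    }

# Made-up intensities for two samples; "TAG 52:3" appears twice.
rows = [
    ("TAG 52:3", [10.0, 20.0]),
    ("TAG 52:3", [30.0, 40.0]),
    ("PC 34:1", [5.0, 6.0]),
]
merged = merge_duplicates(rows, how="average")
print(merged)
```

Whether averaging, taking the max, or summing is appropriate depends on what the duplicated rows represent in your assay, so treat the choice of `how` as a judgment call.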
When I load the data matrix CSV, I get an error and cannot continue because this message is thrown:
The lipid names are coming from UCLA Core's Mass Spec lab, so I think they're somewhat common.
Here's a link to the annot list with the strings of the unreadable lipid names: lipids_annot_list.csv
If I wrote some Python to change the string from 'PE O-16:0/16:0' to 'PE_O- 16:0/16:0', would that allow lipidr to parse the name? How would you recommend I name these lipids so lipidr can successfully parse them?
Thank you so much for developing this awesome tool! I'm really excited to use it.