wkumler / squallms

Repository for the Bioconductor squallms R package

Enable Skyline as an input format (trained) #1

Open wkumler opened 6 months ago

wkumler commented 6 months ago

I've long dreamed of putting our manual Skyline integrations to work as a training dataset for an untargeted detection algorithm. It seems easy enough to export the peak bounds from Skyline, extract the metrics associated with the data inside those bounds, and use those as a threshold for "Good" molecular features.

One problem is how to obtain a representative sample of "Bad" features. This could be done by just selecting random swaths of data from the files, but that risks grabbing a few good peaks by chance.
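A minimal sketch of the random-swath idea, assuming a single mass trace and retention-time windows given as `(start, end)` tuples. The function names, the window width, and the retry limit are all hypothetical. It rejects any random window that overlaps a manually integrated peak, which only guards against the good peaks we already know about. A good peak that was never integrated in Skyline could still slip through.

```python
import random

def overlaps(a_start, a_end, b_start, b_end):
    """True when two closed RT intervals share at least one point."""
    return a_start <= b_end and b_start <= a_end

def sample_bad_windows(good_bounds, rt_range, width, n, seed=0, max_tries=10_000):
    """Draw up to n random RT windows that avoid every known good peak.

    good_bounds: list of (rtmin, rtmax) from the manual integrations
    rt_range:    (lowest RT, highest RT) of the run
    width:       width of each random window, same units as the bounds
    """
    rng = random.Random(seed)  # seeded so the sample is reproducible
    lo, hi = rt_range
    picked = []
    tries = 0
    while len(picked) < n and tries < max_tries:
        tries += 1
        start = rng.uniform(lo, hi - width)
        end = start + width
        # Keep the window only if it touches none of the integrated peaks
        if not any(overlaps(start, end, g0, g1) for g0, g1 in good_bounds):
            picked.append((start, end))
    return picked

# Illustrative values only, not taken from real data
good = [(2.0, 3.0), (5.0, 6.0)]
bad_windows = sample_bad_windows(good, rt_range=(0.0, 10.0), width=0.5, n=20)
```

The retry cap means the function can return fewer than `n` windows when the run is densely covered by good peaks, so the caller should check the length of the result.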

This approach also benefits from not needing RT correction (Skyline doesn't do this at all, iirc).

Could be used to detect things in both PC space and med_cor/med_snr space: PC space to detect "similar" peaks, med_cor/med_snr to find peaks of similar quality.
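For the med_cor/med_snr side, one possible way to turn the Skyline peaks into a threshold is to take a low quantile of each metric across the manually integrated peaks and require new features to clear both. This is only a sketch: the quantile rule, the dictionary layout, and the numbers below are assumptions for illustration, and only the `med_cor` and `med_snr` names come from the discussion above.

```python
def lower_quantile(values, q):
    """Nearest-rank value at quantile q (0 to 1) of the given numbers."""
    ordered = sorted(values)
    idx = min(len(ordered) - 1, max(0, int(q * len(ordered))))
    return ordered[idx]

def fit_thresholds(good_peaks, q=0.05):
    """Learn one cutoff per metric from the manually integrated peaks."""
    return {
        "med_cor": lower_quantile([p["med_cor"] for p in good_peaks], q),
        "med_snr": lower_quantile([p["med_snr"] for p in good_peaks], q),
    }

def is_good(feature, thresholds):
    """A feature passes only if every metric meets its cutoff."""
    return all(feature[name] >= cutoff for name, cutoff in thresholds.items())

# Made-up metric values standing in for Skyline-derived peaks
skyline_peaks = [
    {"med_cor": 0.90, "med_snr": 10.0},
    {"med_cor": 0.80, "med_snr": 20.0},
    {"med_cor": 0.95, "med_snr": 15.0},
]
thresholds = fit_thresholds(skyline_peaks, q=0.0)  # q=0 uses the minimum
```

Using a quantile above zero would make the cutoff less sensitive to a single sloppy manual integration, at the cost of rejecting a few peaks that a person had called good.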

I don't have a clear idea at the moment of how to implement this directly, since transferring the annotations from Skyline to XCMS seems... complicated. There's no trivial way to link XCMS feature IDs to Skyline peaks. Would it be a problem if the Skyline peaks aren't actually XCMS features but have the metrics calculated anyway, so that some features end up represented twice? It shouldn't be.
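If flagging that overlap ever turns out to matter, a rough way to do it without a true ID link would be to match on m/z tolerance plus retention-time overlap. The sketch below assumes both tables have already been read into lists of dictionaries. The field names (`name`, `id`, `mz`, `rtmin`, `rtmax`) and the 5 ppm tolerance are hypothetical and are not the actual Skyline export or XCMS column names.

```python
def match_features(skyline_peaks, xcms_features, ppm=5.0):
    """Map each Skyline peak name to the IDs of features it overlaps.

    A feature matches when its m/z is within the ppm tolerance of the
    Skyline peak and the two retention-time ranges intersect.
    """
    matches = {}
    for peak in skyline_peaks:
        tol = peak["mz"] * ppm / 1e6  # absolute m/z tolerance
        matches[peak["name"]] = [
            feat["id"]
            for feat in xcms_features
            if abs(feat["mz"] - peak["mz"]) <= tol
            and feat["rtmin"] <= peak["rtmax"]
            and peak["rtmin"] <= feat["rtmax"]
        ]
    return matches

# Illustrative values only
skyline = [{"name": "A", "mz": 100.0, "rtmin": 1.0, "rtmax": 2.0}]
xcms = [
    {"id": "FT1", "mz": 100.0003, "rtmin": 1.5, "rtmax": 2.5},  # matches
    {"id": "FT2", "mz": 100.0100, "rtmin": 1.0, "rtmax": 2.0},  # m/z too far
    {"id": "FT3", "mz": 100.0000, "rtmin": 3.0, "rtmax": 4.0},  # no RT overlap
]
linked = match_features(skyline, xcms)
```

An empty list for a Skyline peak would mean no feature covers it, which is exactly the case where calculating the metrics on the Skyline bounds directly adds information.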