sneumann / xcms

This is the git repository matching the Bioconductor package xcms: LC/MS and GC/MS Data Analysis

How are features obtained if you don’t specify any class information for grouping? #678

Open Pembs opened 1 year ago

Pembs commented 1 year ago

How are features selected during grouping and filling if no class information is specified for grouping? I am curious how features are obtained in the older xcmsSet workflow without sample group/class information (specifying sample groups was not mandatory for feature grouping in that older workflow). All parameters were kept the same for each of the workflows.

1. xcmsSet workflow, no class column in phenoData = 5649 features (left out class column by mistake, same number of samples as below)
2. XCMSnExp workflow, sample_groups were 300 samples or 76 QCs = 999 features
3. XCMSnExp workflow, considering samples and QCs all in the same group = 712 features

I can understand why there were slightly more features with the samples vs. QC grouping, since the sample groups are smaller when minFraction is 0.5. But I am curious how features are selected when there is no sample class information. It seems minFraction is not taken into consideration in that case. Is that how they are obtained, i.e. effectively with minFraction = 0 when there is no class information?

jorainer commented 1 year ago

The correspondence method for xcmsSet silently uses sampclass(object), with object being the xcmsSet. I assume the xcmsSet puts all samples into one group if sample classes are not specified, but there is also an automatic sample group estimation based on the folders in which the provided files are stored.
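
To make the effect of sample groups on minFraction concrete, here is a minimal Python sketch of the rule as I understand it from the peak density documentation: a candidate feature is kept if peaks are present in at least minFraction of the samples of at least one sample group. This is not xcms code. The function name, sample names and the "detected in 40 of 76 QCs" candidate are hypothetical; only the group sizes (300 samples, 76 QCs) come from this thread.

```python
from collections import Counter

def passes_min_fraction(peak_samples, sample_groups, min_fraction=0.5):
    """Hypothetical helper: True if the candidate feature has a peak in at
    least `min_fraction` of the samples of at least one sample group."""
    group_sizes = Counter(sample_groups.values())
    # Count each sample once, even if it contributed several peaks.
    hits = Counter(sample_groups[s] for s in set(peak_samples))
    return any(hits[g] / n >= min_fraction for g, n in group_sizes.items())

# Group sizes taken from the thread; the sample names are made up.
samples = [f"sample_{i}" for i in range(300)]
qcs = [f"qc_{i}" for i in range(76)]

two_groups = {**{s: "sample" for s in samples}, **{q: "QC" for q in qcs}}
one_group = {s: "all" for s in samples + qcs}

# Assumed candidate feature: detected in 40 of the 76 QCs and nowhere else.
qc_only_feature = qcs[:40]

with_groups = passes_min_fraction(qc_only_feature, two_groups)   # 40/76 of the QC group
single_group = passes_min_fraction(qc_only_feature, one_group)   # 40/376 of all samples
print(with_groups, single_group)
```

Under these assumptions the candidate is kept when samples and QCs are separate groups but dropped when everything is one group, which would be consistent with the 999 vs. 712 counts above. The sketch does not explain the much larger count from the xcmsSet run; that would depend on which groups sampclass(object) actually contained there.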