Closed pstahlhofen closed 3 years ago
Hi, can you comment what happens if I have spectrum1 with outliers a,b,c and another spectrum2 with outliers x,y,z ? Will I get one file with "x,y,z,a,b,c" ? I also would recommend a better filename. If one day we have outliers because of intensity (or RT or whatever) then outliers.csv
is not specific enough. Or we'd need to encode the reason for the outlying. What about outliers-scanrange
?
Also if you print WARNING: There were ...
I recommend to put that into a warning()
.
I don't think it's ideal to write to a fixed "outliers.csv" without the user's knowledge. Could we handle this instead with the archivename
parameter, the same way as for _Failpeaks.csv
?
If an archivename is specified, failpeaks are written to archivename_Failpeaks.csv
If no archivename is specified, the failpeaks are not written to disk.
So maybe accept an archivename
parameter and write to archivename_outliers.csv
?
This is still not super-ideal, because findMsMsHR.mass
is a low-level rather than a workspace-level function, and wouldn't usually need to know of such a high-level property. But this has precedent in the cpdID
argument which is also not strictly needed in this function.
Then, consider writing to archivename_cpdID_outliers.csv
?
@sneumann your opinion?
In some older files, I believe mzXML and mzData, probably also mzML that aren't made with ProteoWizard, these two columns are NA
. E.g.,
list.files(system.file("spectra.Glucolesquerellin", package = "RMassBankData"), full.names = T)[2] -> t
f <- openMSfile(t)
h <- header(f)
h$scanWindowLowerLimit
Can you check for that and only do your check if those columns are not NA
?
Okay, sorry for doing my minor adjustments after just reading Steffen's first remark about nrow
and not looking into the other stuff yet. I will go through this once again and take care of all your suggestions. Currently all the outliers would just be stashed into one file. @sneumann Should I rather create seperate files for different spectra? And @meowcat I agree that using the archive name is sensible.
After discussing the matter with @meowcat we decided that the additional csv-file for the outliers is not strictly necessary after all. Hence, I simply returned to printing a warning. When writing the log-wrapper functionality, I can add a message for each parsed compound, so the source of the problem could be detected by the order of messages.
Looks good to me now
All peaks outside the scan range are written to a file called
outliers.csv
Furhtermore, a warning is generated if some of the peaks are outside the range. This addresses #255