HUPO-PSI / mzQC

Reporting and exchange format for mass spectrometry quality control data
https://hupo-psi.github.io/mzQC/
Creative Commons Attribution 4.0 International
28 stars 13 forks source link

Add userParam #126

Closed bittremieux closed 3 years ago

bittremieux commented 3 years ago

Provide a userParam type for temporary metrics that don't have a CV term (yet).

mwalzer commented 3 years ago

Right now I'm thinking we should not provide such a mechanism. For one, validation with these would be near impossible, and second, one can certainly already add a custom CV to the file that defines those 'experimental' metrics. Like this, we will very likely get a more complete request to add to the qc-cv. (And we have proper motivation to streamline the CV update process.)

bittremieux commented 3 years ago

This was asked by the reviewers. In our previous call on April 21 we decided that we would strongly discourage people from using this mechanism and instead request proper CV terms instead, but that it can provide a temporary workaround when a relevant CV term is not available.

It is inherent to allowing custom fields that they can't be properly validated. The same problem exists with userParam elements in mzML for example. However, (to my knowledge) it has not led to abuse there. Imo we can throw the reviewers a bone here and include the userParameter without a CV reference. Adding this doesn't change any of our existing files and I don't expect it will be used very often.

julianu commented 3 years ago

I am also in favor of allowing the userParam. It would make it easier for a developer to go ahead and start writing an mzQC exporter without waiting for special metrics to find the way to the CV. As soon as the real term is added to the CV, the changing of the code should not be too difficult. Obviously, we still should encourage to ask for the "real" metric in the CV, but this can go in parallel with writing e.g. an exporter. At least in my experience, the userParams were quite handy when coding an mzIdentML exporter.

tivdnbos commented 3 years ago

I agree with the reasoning from Wout and Julian stated above, and therefore I'm in favor of adding the term as well.

dtabb73 commented 3 years ago

Yes, I think this is the right route. For people who don't have all their terms added to CV will be able to continue prototyping while that process completes. I think discouraging the persistent use of userParam by a tool is appropriate.

cbielow commented 3 years ago

I think we should not abuse a userParam (which in my eyes is extra information which cannot be conveyed using the format's native means) to allow stuffing new 'anonymous' metrics in an mzQC file which:

a) no-one else can interpret or use it (because they do not have a CV term) --> Thus, these files are not suitable for exchange within the community anyways. So if you keep them in-house anyway, you might as well give them arbitrary CV numbers, based on your own CV. b) there is no incentive for the creator to ever change that because in his eyes: 1) it works 2) it validates! 3) "we have generated hundreds of these files already and changing them to a proper CV term would mean we cannot compare them anymore unless we write a conversion script, which I don't have time for". And believe me, this will happen! It happened in OpenMS with non-standard ion-mobility float data-arrays (for which there is a CV term now, but noone bothered to update the code, because 'it works').

If you want to make the userParam standard-compliant by switching to a CV term, you have to not only fix the code (parser/writer), but also change all example/test files, which are lying around somewhere, and its a breaking change (i.e. old mzQC files in the repo where you uploaded some will not get updated) and unless you teach your software to allow for both the new CV terms AND the old userParam (really ugly to maintain btw), you are in a lot of hurt. And every other software which wants to interpret these files, will have to do implement the same fallback. Ouch!

The fact that we discourage this mechanism will not prevent people from using it. And the only effect it will have, is that mzQC files take a harsh beating for not being interchangable and well standardized, i.e. I download an mzQC file from a repo, which passes validation just fine, yet I cannot interpret the information because the interesting metrics (which just got published in a nice paper, but once it was published noone cared about them anymore) are stored as userParam with no CV...

So, if the creator of a new metric is serious about it, he can easily request a new CV term. This is not the show stopper and compared to developing the metric, validating that it is useful and writing a paper about it, the time required to request a CV is negligible.

Allowing this userParam only has negative consequences for everyone. It may look like a good 'hack', but it will lead to headaches for developers and users alike. And it will delay the request for CV terms! (and they are making the format 'rich' in the first place)

I strongly vote for not allowing it. However, if the reviewers can make a better case why this should be useful (and outweigh all the con's I listed) I'm happy to be convinced otherwise.

Just my 50cents.

bittremieux commented 3 years ago

Strong arguments. 🙂 We have decided during today's call not to add a userParameter. Instead, users should request new CV terms, which is an easy enough procedure.