lazear / sage

Proteomics search & quantification so fast that it feels like magic
https://sage-docs.vercel.app
MIT License

Naming inconsistency in output files / documentation #64

Closed: RalfG closed this issue 1 year ago

RalfG commented 1 year ago

Hi @lazear,

First and foremost, thanks for the great work!

Continuing the conversation from compomics/psm_utils#31: it seems that the _q suffix is indeed used internally, which matches the documentation for the result.sage.tsv output: https://github.com/lazear/sage/blob/5f95d454f9b126cf93b2ec96443f4aaa8a88f588/crates/sage-cli/src/output.rs#L62C21-L64

However, the header names use the _fdr suffix: https://github.com/lazear/sage/blob/f55a9e525cf353de94e194428a3a5615d0cbab8c/crates/sage-cli/src/output.rs#L113C18-L115

Personally, I think the _q suffix makes more sense, but for backwards compatibility (which might not be so much of an issue yet), you could opt to keep the _fdr suffix.
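For anyone reading the file downstream while both names are in circulation, a minimal sketch of normalizing the header on read could look like the following. It uses only the Python standard library. The helper names `normalize_name` and `read_rows` and the sample row are hypothetical, and it assumes every affected column ends in either _fdr or _q.

```python
import csv
import io


def normalize_name(name):
    """Map a column name ending in _fdr to the same name ending in _q."""
    suffix = "_fdr"
    if name.endswith(suffix):
        return name[: -len(suffix)] + "_q"
    return name


def read_rows(handle):
    """Yield dict rows from a tab-separated table, keyed by normalized names."""
    reader = csv.reader(handle, delimiter="\t")
    header = [normalize_name(name) for name in next(reader)]
    for row in reader:
        yield dict(zip(header, row))


# Hypothetical two-line table standing in for a real output file.
sample = "peptide\tspectrum_fdr\nPEPTIDEK\t0.001\n"
rows = list(read_rows(io.StringIO(sample)))
```

Files that already use the _q suffix pass through unchanged, so the same reader works whichever way the naming question is settled.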

lazear commented 1 year ago

I agree, the _q suffix makes more sense (and is correct). I think I initially changed it to fdr to make it clearer for users who don't know what q-values are. I will leave this issue open for a week or so in case someone wants to voice opposition to using spectrum_q, etc., instead of spectrum_fdr.

radusuciu commented 1 year ago

+1 for _q. I forget where I first came across this, but when I see q in this context I immediately think: "adjusted FDR".
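To make the "adjusted FDR" reading concrete, here is a rough sketch of how a q-value differs from a raw FDR estimate in a generic target-decoy setting. It is not taken from the Sage source. The function name `q_values`, the decoys/targets estimator, and the toy scores are all assumptions for illustration.

```python
def q_values(scores, is_decoy):
    """Return one q-value per item, in the original input order.

    The raw FDR at each score threshold is estimated as decoys / targets
    (an assumed, simplified estimator). The q-value of an item is the
    smallest FDR over all thresholds that would still accept it.
    """
    # Rank items from best to worst score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    # Raw FDR estimate after accepting each successive item.
    fdrs = []
    targets = decoys = 0
    for i in order:
        if is_decoy[i]:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # Running minimum from the worst rank upward makes the values monotone.
    result = [0.0] * len(scores)
    running = float("inf")
    for rank in range(len(order) - 1, -1, -1):
        running = min(running, fdrs[rank])
        result[order[rank]] = running
    return result


# Toy input: five matches, the third best of which is a decoy.
example = q_values([10, 9, 8, 7, 6], [False, False, True, False, False])
```

In the toy input the raw FDR jumps to 0.5 at the decoy and then falls again as more targets are accepted, while the q-values never decrease as the score gets worse. That monotone adjustment is why the _q name describes the column more accurately than _fdr.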