MannLabs / alphapept

A modular, python-based framework for mass spectrometry. Powered by nbdev.
https://mannlabs.github.io/alphapept/
Apache License 2.0

Enable specification of matching range #255

Open JuliaS92 opened 3 years ago

JuliaS92 commented 3 years ago

Is your feature request related to a problem? Please describe.
With biological samples, we want to avoid matching identifications between files where the protein might genuinely be absent from one of the samples (e.g. KO samples, IPs, fractionation, ...), while still being able to transfer IDs between exact biological replicates to boost the numbers. The same goes for peptide fractions: you would not want to match between, e.g., the 1st and 3rd SDB-RPS fractions, but only 1<->2 and 2<->3. Currently, matching is all or nothing.

Describe the solution you'd like
I very much like the solution in MQ, where only neighboring fraction numbers are matched. It could become even better if you offered two matching 'dimensions', so that you can, e.g., match the same peptide fractions of neighboring biological samples as well as neighboring peptide fractions of the same biological sample. In the example below, the grid is biological sample x peptide fraction, and sample *2x2* would receive IDs from _4 other raw files_.

-------------------
| 1x1 |_1x2_| 1x3 |
-------------------
|_2x1_|*2x2*|_2x3_|
-------------------
| 3x1 |_3x2_| 3x3 |
-------------------
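
The grid above can be sketched in a few lines of Python. This is only an illustration of the requested behavior, not AlphaPept code: the function name `matching_partners`, the `(sample, fraction)` tuple layout, and the `max_distance` parameter are all assumptions made for this example. Two files are treated as partners when they share one dimension and are direct neighbors along the other.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: each file is annotated as (biological sample, peptide fraction).
Coord = Tuple[int, int]


def matching_partners(files: Dict[str, Coord], max_distance: int = 1) -> Dict[str, List[str]]:
    """For each file, list the files it may receive IDs from.

    A partner shares one dimension and lies within max_distance
    along the other one (diagonal neighbors are excluded).
    """
    partners: Dict[str, List[str]] = {name: [] for name in files}
    for name, (sample, fraction) in files.items():
        for other, (o_sample, o_fraction) in files.items():
            if other == name:
                continue
            # Same biological sample, neighboring peptide fraction.
            along_fraction = sample == o_sample and 0 < abs(fraction - o_fraction) <= max_distance
            # Same peptide fraction, neighboring biological sample.
            along_sample = fraction == o_fraction and 0 < abs(sample - o_sample) <= max_distance
            if along_fraction or along_sample:
                partners[name].append(other)
    return partners


# The 3x3 example: keys are "<sample>x<fraction>".
grid = {f"{s}x{f}": (s, f) for s in (1, 2, 3) for f in (1, 2, 3)}
partners = matching_partners(grid)
print(sorted(partners["2x2"]))  # ['1x2', '2x1', '2x3', '3x2']
```

Note that this sketch does not match files with identical coordinates; exact replicates would need their own rule (for example, a separate replicate column).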
straussmaximilian commented 3 years ago

Hi, unfortunately, Streamlit has no native editable-table option for conveniently selecting things such as fractions and matching groups. I have now found a workaround for an editable table (screenshot attached) that could potentially work.

[Image: Screenshot 2021-07-04 at 23 45 05]

The idea would be to have a multiselect above the table to exclude runs, and then to enter the data for each file manually. There is now also a Shortname column that contains the filename without its extension and will be checked for duplicates, so we could use it for cleaner protein group column names.
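
As a rough sketch of the Shortname idea (the helper names `make_shortnames` and `find_duplicates` and the example paths are made up for illustration and are not taken from the AlphaPept code base), the shortname could be derived from the file path and then checked for collisions:

```python
from collections import Counter
from pathlib import Path
from typing import Iterable, List


def make_shortnames(paths: Iterable[str]) -> List[str]:
    """Return the filename without directory and without its last extension."""
    return [Path(p).stem for p in paths]


def find_duplicates(names: Iterable[str]) -> List[str]:
    """Return every name that occurs more than once, sorted alphabetically."""
    counts = Counter(names)
    return sorted(name for name, n in counts.items() if n > 1)


# Hypothetical input: two runs in different folders share a filename.
raw_files = [
    "/data/run_a/sample_01.raw",
    "/data/run_b/sample_01.raw",
    "/data/run_a/sample_02.raw",
]
shortnames = make_shortnames(raw_files)
duplicates = find_duplicates(shortnames)
print(duplicates)  # ['sample_01']
```

`Path.stem` strips only the last suffix, so a name with several extensions would keep the earlier ones.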

For automated annotation, I thought about including a regex function that would automatically fill the cells based on the filename.
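
A minimal sketch of what such a regex function might look like, assuming a purely hypothetical naming scheme in which the filename contains `_S<sample>_F<fraction>`. The pattern, the function name `annotate`, and the returned keys are illustrative only; in practice the pattern would have to be user-configurable.

```python
import re
from typing import Dict, Optional

# Hypothetical naming scheme: ..._S<sample number>_F<fraction number>...
PATTERN = re.compile(r"_S(?P<sample>\d+)_F(?P<fraction>\d+)")


def annotate(shortname: str, pattern: "re.Pattern[str]" = PATTERN) -> Dict[str, Optional[int]]:
    """Extract sample and fraction from a shortname.

    Returns None for both fields when the pattern does not match,
    so the corresponding cells can be left empty for manual entry.
    """
    match = pattern.search(shortname)
    if match is None:
        return {"sample": None, "fraction": None}
    return {key: int(value) for key, value in match.groupdict().items()}


print(annotate("20210704_HeLa_S2_F3"))  # {'sample': 2, 'fraction': 3}
print(annotate("blank_run"))            # {'sample': None, 'fraction': None}
```

Leaving unmatched files empty rather than raising an error keeps the manual table as the fallback.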

@ammarcsj @JuliaS92 What do you think about this? More specifically:

There are a couple of limitations with this table layout (e.g., can't select multiple cells and change them at once), but this could be a start.

ammarcsj commented 3 years ago

Hi Max, this looks cool, just some questions/thoughts:

JuliaS92 commented 3 years ago

Hi @straussmaximilian @ammarcsj,

ibludau commented 3 years ago

Not sure if this is already integrated, but an option to just upload a design table could be a good addition/alternative. Some people might simply be more comfortable creating the table beforehand.
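
To make the design-table idea concrete, here is a small sketch of reading such a table from CSV. The column names (`shortname`, `sample`, `fraction`) and the function `read_design_table` are assumptions for this example, not an existing AlphaPept format.

```python
import csv
import io
from typing import Dict, TextIO, Tuple

# Hypothetical column names for the uploaded design table.
REQUIRED_COLUMNS = ("shortname", "sample", "fraction")


def read_design_table(handle: TextIO) -> Dict[str, Tuple[int, int]]:
    """Parse a CSV design table into {shortname: (sample, fraction)}."""
    reader = csv.DictReader(handle)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"Design table is missing columns: {missing}")
    design: Dict[str, Tuple[int, int]] = {}
    for row in reader:
        name = row["shortname"]
        if name in design:
            # Shortnames must be unique so each row maps to exactly one file.
            raise ValueError(f"Duplicate shortname in design table: {name}")
        design[name] = (int(row["sample"]), int(row["fraction"]))
    return design


# In-memory stand-in for an uploaded file.
example = io.StringIO("shortname,sample,fraction\nrun_01,1,1\nrun_02,1,2\n")
design = read_design_table(example)
print(design)  # {'run_01': (1, 1), 'run_02': (1, 2)}
```

In a Streamlit app the handle would probably come from `st.file_uploader`, which returns a bytes buffer that would need to be decoded to text first.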