compomics / moFF

A modest Feature Finder (moFF) to extract MS1 intensities from Thermo raw file
Apache License 2.0

moff_mbr breaks output file header #16

Closed by jgriss 7 years ago

jgriss commented 7 years ago

Hi,

I used moff_mbr to match features between runs but the output file headers seem to be messed up. Additionally, the value of the run_test column was changed from false to 0.

Here's an example (these are simply the first two lines of each file; since the MBR function added PSMs, the row order apparently changed):

Original file:

peptide prot    mod_peptide     rt      mz      mass    charge  filename        run_test
YLIDHIK P05749;P56628   YLIDHIK 2640.8308       301.1746        900.4997999999999       3.0     mgf/site_86-6A-rep1.mgf false

moff-mbr ("_match") file:

run_test   charge  code_unique     filename        mass    matched mod_peptide     mz      peptide prot    rt
0.0     2.0     YDSTHGR mgf/site_86-6A-rep1.mgf 834.3547        0       YDSTHGR 418.18533       YDSTHGR P00358;P00359;P00360    1.0525

Maux82 commented 7 years ago

Hi,

Did you run moff_mbr.py alone, or did you run moff_all.py?

jgriss commented 7 years ago

Hi,

I ran moff_mbr.py alone.

Thanks for the help!

Maux82 commented 7 years ago

moff_mbr.py is meant to be run only from within moff_all.py, when you want the entire workflow (mbr + apex). Let me know if you have the same problem when you run moff_all.py.

If you are interested in just the mbr output, comment out lines 140 to 213 in moff_all.py and it should save the _match files.

jgriss commented 7 years ago

Thanks for the information! I will give it a try.

Maybe you could update the documentation in the README. This wasn't apparent to me.

Thanks for the help!

jgriss commented 7 years ago

I have now launched the whole pipeline through moff_all.py. Still, the generated *_match.txt files contain messed up headers. The behavior is exactly the same as when calling moff_mbr.py directly.

Maux82 commented 7 years ago

I have tested moff_mbr.py alone, and the header of the generated file is still fine.

The fact that the order of the fields is not the same is a behaviour of pandas during concat and merge, but the changed values sound like a bug.

Can I have a couple of your input files so I can reproduce this problem?
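
Since the column order of the _match file can differ from the input, a downstream script is safer reading fields by header name than by position. Here is a minimal sketch using only the Python standard library. It assumes the file is tab-separated and reuses the header and first row quoted earlier in this thread. The variable names are illustrative and not part of moFF.

```python
import csv
import io

# Header and first row of the "_match" file quoted earlier in this thread,
# assumed here to be tab-separated.
match_text = (
    "run_test\tcharge\tcode_unique\tfilename\tmass\tmatched\t"
    "mod_peptide\tmz\tpeptide\tprot\trt\n"
    "0.0\t2.0\tYDSTHGR\tmgf/site_86-6A-rep1.mgf\t834.3547\t0\t"
    "YDSTHGR\t418.18533\tYDSTHGR\tP00358;P00359;P00360\t1.0525\n"
)

# DictReader keys every row by the header, so the column order is irrelevant.
reader = csv.DictReader(io.StringIO(match_text), delimiter="\t")
rows = list(reader)

first = rows[0]
peptide = first["peptide"]  # looked up by name, not by index
rt = float(first["rt"])     # csv yields strings, so convert explicitly
```

With a real file, the `io.StringIO(...)` object would be replaced by an open file handle.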

jgriss commented 7 years ago

I have now re-checked all the files: The reason why my scripts broke was not that the order of the columns changed (this is fine) but that a column containing "true" / "false" was changed to "1" / "0".

But I guess this is a pandas thing.

So your code seems to be fine.

Sorry about that!
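
For anyone hitting the same thing, here is a rough sketch of how the _match table could be post-processed so that the original columns come first and run_test is written as "true" / "false" again. It does not reproduce or change what moFF does. `restore_layout` is a hypothetical helper, and the sketch assumes the flag column holds numbers, booleans, or "true" / "false" strings.

```python
import pandas as pd

def restore_layout(matched, original_columns, flag_columns=("run_test",)):
    """Hypothetical helper, not part of moFF: put the original columns first
    (in their original order), keep new columns after them, and render the
    flag columns as "true"/"false" strings."""
    leading = [c for c in original_columns if c in matched.columns]
    extra = [c for c in matched.columns if c not in leading]
    out = matched[leading + extra].copy()

    def to_flag(value):
        if pd.isna(value):
            return value              # leave missing values untouched
        if isinstance(value, str):
            return value.lower()      # already textual, only normalise case
        return "true" if bool(value) else "false"

    for col in flag_columns:
        if col in out.columns:
            out[col] = out[col].map(to_flag)
    return out

# Column order of the original input file quoted in this thread.
original_columns = ["peptide", "prot", "mod_peptide", "rt", "mz",
                    "mass", "charge", "filename", "run_test"]

# First row of the "_match" file quoted in this thread.
matched = pd.DataFrame({
    "run_test": [0.0], "charge": [2.0], "code_unique": ["YDSTHGR"],
    "filename": ["mgf/site_86-6A-rep1.mgf"], "mass": [834.3547],
    "matched": [0], "mod_peptide": ["YDSTHGR"], "mz": [418.18533],
    "peptide": ["YDSTHGR"], "prot": ["P00358;P00359;P00360"], "rt": [1.0525],
})

fixed = restore_layout(matched, original_columns)
```

The result can then be written back with `fixed.to_csv(path, sep="\t", index=False)`, where `path` is whatever output file you choose.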