SimpleNumber / aa_stat

AA_stat is a tool for uncovering unexpected modifications of amino acid residues in protein sequences, as well as possible artifacts of data acquisition or processing, in the results of proteome analyses.

Analysis of Human-Derived Serum Samples #6

Closed BenSamy2020 closed 3 years ago

BenSamy2020 commented 3 years ago

Greetings,

I am trying to analyze proteomics data from human-derived serum samples (searched using MSFragger OpenSearch). Each file has ~150 proteins and <500 peptides. Your tool requires >500 peptides to run successfully. Is there any way I can tweak the program so that it analyzes proteomics files containing <500 peptides?

I understand that the analysis will be sub-optimal with fewer peptides, but I am fine with a sub-optimal outcome.

Regards, Ben

[Screenshot: Capture]

levitsky commented 3 years ago

Dear Ben, the warnings that you see and the error are probably not related to each other. Can you please update to the current master once again? I have fixed an issue which you are probably experiencing right now. If that is the case, you will still see the warnings but the program should carry on.

BenSamy2020 commented 3 years ago

Greetings,

I have managed to run the program, but there are still some issues. I have provided a screenshot of the error below. Also, your bioRxiv paper shows a summary output (Figure 2). I cannot find that in my output directory.

Regards, Ben

[Screenshots: CMD_Commands, aa_stat_output_files]

levitsky commented 3 years ago

Hi Ben,

Sorry about the problem you're having. I have a theory that AA_stat doesn't process your files correctly because of modifications. Processing the results with variable modifications is something we only added recently. Did you specify +42 Da as protein N-terminal modification? If so, AA_stat is probably tripping on that.

I will work on a fix. Hopefully I will be able to reproduce the issue myself, but if you are able to share your files with me, that would possibly speed things up.

In the meantime you can try re-running the search without that modification, and see if everything works. The search results shouldn't be very different.

BenSamy2020 commented 3 years ago

Greetings,

Can you share your Dropbox link with me? I will send the files over.

Regards, Ben

levitsky commented 3 years ago

Thank you @BenSamy2020. I couldn't easily reproduce the problem on my own, so I'd appreciate it if you could upload your mzML and pepXML files here: https://www.dropbox.com/request/nPpldDn6ERjyuMf69UAJ. The MSFragger params would also help. Thank you!

BenSamy2020 commented 3 years ago

Greetings,

I just uploaded the files. Could you please confirm that you have received them?

Regards, Ben

levitsky commented 3 years ago

Thank you @BenSamy2020. I got the files you uploaded but I don't see the mzML file. It would help me because the error you showed happens at localization, so I won't be able to reproduce it without mzML.

BenSamy2020 commented 3 years ago

The mzML file is 2.9 GB. I am not able to upload it. [Screenshot: Capture]

BenSamy2020 commented 3 years ago

Can I get your email? I might have a way to send it to you.

levitsky commented 3 years ago

Thank you Ben, my email is <...>

BenSamy2020 commented 3 years ago

Greetings,

Just sent it. Please confirm once received.

levitsky commented 3 years ago

Yep, I received everything and reproduced the error. Thank you!

BenSamy2020 commented 3 years ago

Great, looking forward to the updated AA_stat tool, mate!

levitsky commented 3 years ago

@BenSamy2020 you're welcome to try the latest version; it worked for me with your files. For now the report simply says that +42 is a terminal modification. I may tweak this later, but the analysis should work.

BenSamy2020 commented 3 years ago

@levitsky It's working. Thanks a lot for troubleshooting.

levitsky commented 3 years ago

@BenSamy2020 also note that you have decoys marked with the rev_ prefix in your data (the default for MSFragger), but AA_stat had `DECOY` set as the default until I changed it just now. So your input will not be properly filtered if you don't set the prefix in your settings:

[data]
decoy prefix: rev_

In the latest version, having an incorrect decoy prefix will result in an error.

TLDR: please update AA_stat again, or at least use the settings above. With the most recent version, you don't need to change anything.
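
For readers who want to check their own input before running the tool, here is a minimal sketch of the idea behind the decoy prefix check. It is not AA_stat's actual code: the function names, the protein identifiers, and the use of Python's standard `configparser` to read the settings fragment are all assumptions made for illustration.

```python
import configparser

# The settings fragment quoted in the thread.
SETTINGS = """
[data]
decoy prefix: rev_
"""


def read_decoy_prefix(settings_text):
    """Return the 'decoy prefix' value from the [data] section (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(settings_text)
    return parser.get("data", "decoy prefix")


def count_decoys(protein_ids, prefix):
    """Count protein identifiers that start with the decoy prefix."""
    return sum(1 for protein in protein_ids if protein.startswith(prefix))


def check_decoy_prefix(protein_ids, prefix):
    """Raise an error when no identifier carries the prefix.

    Zero matches usually means the prefix in the settings does not
    match the one used by the search engine.
    """
    n_decoys = count_decoys(protein_ids, prefix)
    if n_decoys == 0:
        raise ValueError(f"No decoys found with prefix {prefix!r}; check your settings.")
    return n_decoys


if __name__ == "__main__":
    # Made-up identifiers, for illustration only.
    proteins = ["protein_A", "rev_protein_A", "protein_B"]
    prefix = read_decoy_prefix(SETTINGS)
    print(prefix, check_decoy_prefix(proteins, prefix))
```

With a mismatched prefix such as `DECOY_`, `check_decoy_prefix` raises an error on the same list, which mirrors the behavior described above for the latest version.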