snijderlab / stitch

Template-based assembly of proteomics short reads for de novo antibody sequencing and repertoire profiling
MIT License

Error in Casanovo file parsing #248

Closed wenjinwu1985 closed 4 months ago

wenjinwu1985 commented 4 months ago

HELP [Uploading hawk.zip…]

Error: Not a valid number
╭── D:\stitch-v1.5.0-windows\DETA TEST\hawk.mztab:19809:227
│
19807 │ PSM SYTLLELPPTPSGTTTLSKGNVVT 19746 null null null null "[MS, MS:1003281, Casanovo, 4.1.0]" -0.505144049 null nu …
19808 │ PSM LSADLDELM+15.995DTALREELQNPLEELREEVGTHLDLS 19747 null null null null "[MS, MS:1003281, Casanovo, 4.1.0]" -0 …
19809 │ …=19767 null null null null "0.27434,0.24810,0.24681,0.20568,0.23882,0.23139,0.20678,0.24179,0.22357,0.19806,0.…
· ────────
19810
douweschulte commented 4 months ago

I saw you also listed this below another open issue. Thanks for opening a new one instead, I deleted the comments below the other issue.

For now though I suspect something is off with your mztab file, could you upload it here or send it to me via email?

Note: if you put any formatted text (with important whitespace) between two lines of three backticks (`), GitHub will preserve the whitespace and the text will display a bit nicer.

wenjinwu1985 commented 4 months ago

The mztab could not be uploaded as is, so I compressed it.

douweschulte commented 4 months ago

I can open the files you added and there does not seem to be a problem at the indicated lines, but these files also do not match the errors you posted. Did you upload the same files that you ran through Stitch?

wenjinwu1985 commented 4 months ago

Can we communicate via email?

douweschulte commented 4 months ago

Yes we can, but I have found the cause of the error. The file you sent contains double quotes around the aa_scores column in the mztab, which Stitch did not expect. A fixed version is building right now. In about an hour you can download it and see if that fixes your problem.
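
As a stopgap for anyone who cannot update right away, the quotes could also be stripped from the file before running Stitch. The following is only a minimal sketch in Python, not what Stitch does internally. It assumes the file is tab-separated, that the header row starts with `PSH`, and that the affected column header ends in `aa_scores`; the function name `strip_aa_score_quotes` is made up for illustration.

```python
def strip_aa_score_quotes(lines):
    """Yield mzTab lines with double quotes removed around the aa_scores value.

    Assumption: the PSH header row names a column ending in "aa_scores".
    All other columns and all non-PSM rows are passed through unchanged.
    """
    aa_col = None
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if fields[0] == "PSH":
            # Find the index of the aa_scores column from the header row.
            aa_col = next(
                (i for i, name in enumerate(fields) if name.endswith("aa_scores")),
                None,
            )
        elif fields[0] == "PSM" and aa_col is not None and aa_col < len(fields):
            # Remove only the surrounding quotes of this one column.
            fields[aa_col] = fields[aa_col].strip('"')
        yield "\t".join(fields)


def fix_file(src_path, dst_path):
    """Write a copy of src_path with unquoted aa_scores to dst_path."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8", newline="\n") as dst:
        for fixed in strip_aa_score_quotes(src):
            dst.write(fixed + "\n")
```

Calling `fix_file("hawk.mztab", "hawk_fixed.mztab")` would write a cleaned copy next to the original and leave the original untouched.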

wenjinwu1985 commented 4 months ago

I can download the latest version over there, right? Thank you

wenjinwu1985 commented 4 months ago

I don't know where to look or how to use it.

wenjinwu1985 commented 4 months ago

image

douweschulte commented 4 months ago

You will have to download the new binary from the GitHub Actions page: https://github.com/snijderlab/stitch/actions/runs/8850775149. That version will work the same as the current version. Let me know if it does not solve the problem.

wenjinwu1985 commented 4 months ago

ok

wenjinwu1985 commented 4 months ago
image
douweschulte commented 4 months ago

Uploading Peng2021_Herceptin_aspN.zip…

The file was not uploaded fully yet when you sent the message so I am not able to see it.

douweschulte commented 4 months ago

This file seems to be slightly corrupted in the second line. Could you remove the line starting with casanovo sequence -o H: and try again?

douweschulte commented 4 months ago

On my side removing this line and using the most recent version of stitch makes it run.

wenjinwu1985 commented 4 months ago

Can you send me the file that you've processed and can run? I'd like to take a look. Thank you

douweschulte commented 4 months ago

Peng2021_Herceptin_aspN_fixed.zip

douweschulte commented 4 months ago

Deleting that whole line in something like notepad should work. Notepad makes it a bit easier to not leave empty lines and such in the final file.

douweschulte commented 4 months ago

The whole line has to be deleted. This line is not valid according to the specification of mzTab files.
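
If editing by hand is awkward, the same deletion could be scripted. This is a minimal sketch in Python, assuming the offending line starts exactly with `casanovo sequence -o H:` as described above; the function names are hypothetical and the rest of the file is copied unchanged, without leaving an empty line behind.

```python
def drop_lines_with_prefix(lines, prefix):
    """Return all lines except those that start with the given prefix."""
    return [line for line in lines if not line.startswith(prefix)]


def clean_mztab(src_path, dst_path, prefix="casanovo sequence -o H:"):
    """Copy src_path to dst_path, omitting every line that starts with prefix."""
    with open(src_path, encoding="utf-8") as src:
        kept = drop_lines_with_prefix(src.readlines(), prefix)
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.writelines(kept)
```

The removed line is dropped together with its line ending, so the output has one line fewer than the input rather than a blank line in its place.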

douweschulte commented 4 months ago

Did you download the newest version of stitch from this page? https://github.com/snijderlab/stitch/actions/runs/8850775149

wenjinwu1985 commented 4 months ago

Yes, I downloaded the file from the homepage. Is the difficulty on my end? Would it be convenient for you to compress and send me your latest file via email?

douweschulte commented 4 months ago

Peng2021_Herceptin_aspN_fixed.zip

Here is the fixed results file.

douweschulte commented 4 months ago

And here is a direct link to the latest Stitch version: https://github.com/snijderlab/stitch/actions/runs/8850775149/artifacts/1451480417

douweschulte commented 4 months ago

If you run stitch --version it should show the following: (image). If it does, you know you have the latest version.

wenjinwu1985 commented 4 months ago

Sorry, I don't know how to update to the version you provided. I tried using the fixed file you gave me, but it still showed errors. So it seems the issue is that I haven't updated. Can you help me with this? Thank you.

wenjinwu1985 commented 4 months ago
image
wenjinwu1985 commented 4 months ago
image
douweschulte commented 4 months ago
image

This one looks good! If you now copy your batch file into this new folder, alongside any data that lived in the older Stitch folder, it will work.

wenjinwu1985 commented 4 months ago
image

Thank you very much. I have successfully run it. Thank you for your help!

douweschulte commented 4 months ago

You are welcome.

douweschulte commented 4 months ago

Could you retry uploading the data file?

wenjinwu1985 commented 4 months ago

Is there a tutorial available for these settings? I'm not sure which ones can affect accuracy.


douweschulte commented 4 months ago

Uploading 280 Sample 1denovo - 2.csv…

You have to wait a little longer until the file is actually uploaded before you click send.

douweschulte commented 4 months ago

There is a manual with in-depth information on each setting. If that does not help you, reading the papers and testing the software by running it a couple of times might be your best bet. If you have questions about specific things in Stitch you can also reach out, but you will have to provide more details for me to be able to help you.

wenjinwu1985 commented 4 months ago

Can you share the manual?

douweschulte commented 4 months ago

You can find it in the list of releases. For each release the manual is added as pdf.

douweschulte commented 4 months ago

280 Sample 1denovo - 2.csv

Are you sure you ran this with the correct file format specification? It looks like a PEAKS file, so you will have to specify that and also specify the correct file format version, otherwise you get errors like the ones above. The manual lists all possible versions.

douweschulte commented 4 months ago

On the peaks version: version eleven means you need to set 11 in stitch.

wenjinwu1985 commented 4 months ago

On the peaks version: version eleven means you need to set 11 in stitch.

I didn't understand what you meant by this sentence.

wenjinwu1985 commented 4 months ago
image
douweschulte commented 4 months ago

Use:

Input ->
    Peaks ->
        Path: ...
        Format: 11
        ....
    <-
<-

But note that the files distributed with Stitch are PEAKS X+.

I find it hard to help you through English. Do you have any colleagues with a better grasp of English who could help you understand what you are doing?