snijderlab / stitch

Template-based assembly of proteomics short reads for de novo antibody sequencing and repertoire profiling
MIT License

How to use raw data #246

Closed biocc closed 6 months ago

biocc commented 6 months ago

Dear Sir,

Thank you for your effort on antibody sequence assembly.

I ran into some trouble using Stitch 1.5. I want to obtain the correct sequence, including disambiguated I and L residues, so I need to supply the raw data. I have the MGF file; how should I provide it so that the de novo results are matched to the raw file? Could you also provide more complete demo data?

Input ->
    Peaks ->
        Path : ../datasets/200305_HER_test_04_DENOVO.csv
        Format : X+
        Name : 01
        CutoffALC: 95
        RawDataDirectory: R:\F1\peng0013\201912
        XleDisambiguation: True

douweschulte commented 6 months ago
Input ->
    Peaks ->
        Path : ../datasets/200305_HER_test_04_DENOVO.csv
        Format : X+
        Name : 01
        CutoffALC: 95
        RawDataDirectory: R:\F1\peng0013\201912
        XleDisambiguation: True
    <-
<-

This batchfile configuration is all that is needed to use raw data. Stitch takes all peptides from ../datasets/200305_HER_test_04_DENOVO.csv. That file records, for each peptide, the name of the raw data file it originated from (this is the default for most identified-peptide formats). Stitch then looks in R:\F1\peng0013\201912 for files with those names. It is therefore important that the files are NOT renamed between running the peptide sequencing software and running Stitch.
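If it helps, the name matching described above can be checked before starting a run. The following is only a rough sketch in Python, not part of Stitch: the function `missing_raw_files` is hypothetical, the column name `Source File` is an assumption about the peptide export and may differ in your CSV, and comparing exact file names (including the extension) is an assumption of this sketch rather than a description of how Stitch itself resolves the files.

```python
import csv
from pathlib import Path


def missing_raw_files(peptide_csv, raw_dir, column="Source File"):
    """List raw file names referenced in the peptide CSV but absent from raw_dir.

    `column` is an assumed header name; pass the header your export actually uses.
    """
    with open(peptide_csv, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        if column not in (reader.fieldnames or []):
            raise KeyError(f"column {column!r} not found in {peptide_csv}")
        # Collect the distinct, non-empty file names the peptides refer to.
        referenced = {
            (row[column] or "").strip()
            for row in reader
            if (row[column] or "").strip()
        }
    # Names of the regular files actually present in the raw data directory.
    present = {entry.name for entry in Path(raw_dir).iterdir() if entry.is_file()}
    # Anything referenced but not present was renamed, moved, or never copied.
    return sorted(referenced - present)
```

Calling it with the `Path` and `RawDataDirectory` values from the batchfile returns a list of names; an empty list means every referenced file name was found in the directory under the assumptions above.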

Creating some demo data is indeed a good idea, but because raw files are commonly extremely large I have not done so yet. I will think of a way to scale one down and create some demo data.

Feel free to close the issue if this answers your question, or to ask if you have further questions.