Rappsilber-Laboratory / build-xiview

GNU General Public License v3.0

Empty view if mzIdentML file contains protein sequences (public website) #85

Closed vrkosk closed 2 years ago

vrkosk commented 2 years ago

Upload an mzIdentML 1.2 file where protein sequences are encoded like:

    <DBSequence id="DBSeq_1_LYSC_CHICK" searchDatabase_ref="SDB_SwissProt" accession="LYSC_CHICK" >
      <Seq>MRSLLILVLCFLPLAALGKVFGRCELAAAMKRHGLDNYRGYSLGNWVCAAKFESNFNTQATNRNTDGSTDYGILQINSRWWCNDGRTPGSRNLCNIPCSALLSSDITASVNCAKKIVSDGNGMNAWVAWRNRCKGTDVQAWIRGCRL</Seq>
      <cvParam accession="MS:1001088" name="protein description" cvRef="PSI-MS" value="Lysozyme C OS=Gallus gallus OX=9031 GN=LYZ PE=1 SV=1" />
    </DBSequence>

xiVIEW doesn't report any warnings or errors on upload. However, no proteins are shown and it seems all peptide matches are thresholded out of existence (e.g. "0 out of 8").

If I delete the <Seq> elements and re-upload, proteins are shown fine.
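For anyone wanting to reproduce that step without hand-editing the file, here is a minimal sketch using only the Python standard library. The helper names (`local_name`, `strip_seq_elements`) are made up for illustration, and the sample document is a shortened stand-in for a real export. Tags are matched by local name, so the exact mzIdentML namespace URI does not matter.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for a real export; the namespace URI is an assumption
# and is never relied on by the code below.
SAMPLE = """<MzIdentML xmlns="http://psidev.info/psi/pi/mzIdentML/1.2">
  <SequenceCollection>
    <DBSequence id="DBSeq_1_LYSC_CHICK" searchDatabase_ref="SDB_SwissProt" accession="LYSC_CHICK">
      <Seq>MRSLLILVLC</Seq>
      <cvParam accession="MS:1001088" name="protein description" cvRef="PSI-MS" value="Lysozyme C"/>
    </DBSequence>
  </SequenceCollection>
</MzIdentML>"""


def local_name(tag: str) -> str:
    """Return the tag name without its '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def strip_seq_elements(root: ET.Element) -> int:
    """Remove every <Seq> child of a <DBSequence>; return how many were removed."""
    # Collect the parents first so the tree is not modified while it is walked.
    db_seqs = [e for e in root.iter() if local_name(e.tag) == "DBSequence"]
    removed = 0
    for db_seq in db_seqs:
        for child in list(db_seq):
            if local_name(child.tag) == "Seq":
                db_seq.remove(child)
                removed += 1
    return removed


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    print("removed", strip_seq_elements(root), "Seq element(s)")
```

For a file on disk, the same function can be applied to `ET.parse(path).getroot()` and the tree written back with `tree.write(...)`. Note that ElementTree may rewrite namespace prefixes when it serialises.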

colin-combe commented 2 years ago

Hi Ville,

are you able to share the whole mzIdentML file? (something is clearly going wrong)

cheers, Colin

colin-combe commented 2 years ago

(just tested: other mzid files still work with Seq included)

vrkosk commented 2 years ago

I'm wrong, it's actually caused by the presence of sameset proteins, not protein sequences. Attached are mzIdentML exports of the example disulfide search from a Mascot development version. Each file validates fine against the schema, and the mzIdentML validator version 1.4.35 reports no errors.

F002553.mzid has default settings (no samesets, no protein sequences) and works OK.

F002553_seqs.mzid has protein sequences and works OK.

F002553_samesets.mzid has sameset proteins and xiVIEW shows no proteins at all.

F002553_samesets_and_seqs.mzid has sameset proteins and protein sequences and xiVIEW shows no proteins at all (this is the file I initially used).
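A quick way to compare the four exports is to count the `DBSequence` entries in each and how many of them carry a `Seq` child. The sketch below is only an illustration: `summarise_proteins` is a made-up helper, the sample XML is a toy document, and treating one `DBSequence` as one protein is an assumption about how a viewer would count them.

```python
import xml.etree.ElementTree as ET

# Toy document: two DBSequence entries, only one of which has a <Seq> child.
SAMPLE = """<MzIdentML>
  <SequenceCollection>
    <DBSequence id="DBSeq_1" accession="P1"><Seq>MRSLLILVLC</Seq></DBSequence>
    <DBSequence id="DBSeq_2" accession="P2"/>
  </SequenceCollection>
</MzIdentML>"""


def local_name(tag: str) -> str:
    """Return the tag name without its '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def summarise_proteins(root: ET.Element) -> dict:
    """Count DBSequence elements and how many of them include a <Seq> child."""
    db_seqs = [e for e in root.iter() if local_name(e.tag) == "DBSequence"]
    with_seq = sum(
        1 for e in db_seqs if any(local_name(c.tag) == "Seq" for c in e)
    )
    return {"proteins": len(db_seqs), "with_sequence": with_seq}


if __name__ == "__main__":
    print(summarise_proteins(ET.fromstring(SAMPLE)))
```

Running it over each attached file (via `ET.parse(path).getroot()`) would show whether the sameset exports differ from the others mainly in the number of proteins listed.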

F002553_seqs.zip F002553_samesets.zip F002553_samesets_and_seqs.zip F002553.zip

colin-combe commented 2 years ago

It might be that things are working OK, but you're encountering a confusing bit of xiVIEW behaviour.

People asked to have the self links unselected by default for large searches (>50 proteins). In retrospect, this was a bad idea and we're changing it back.

I think the 'samesets' mzId files do have more than 50 proteins, even though only one of them is crosslinked, so the "Self" checkbox in the bottom filter bar is probably unchecked. If you check it, your 8 cross-links should come back?
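To make the interaction concrete, here is a small sketch of the behaviour as described in this comment. It is not xiVIEW's actual code: the names (`CrossLink`, `self_links_checked_by_default`, `visible_links`) are hypothetical, and the only detail taken from the thread is the 50-protein threshold.

```python
from dataclasses import dataclass

# Threshold taken from the comment above ("large searches (>50 proteins)").
SELF_LINK_DEFAULT_LIMIT = 50


@dataclass(frozen=True)
class CrossLink:
    protein_a: str
    protein_b: str

    @property
    def is_self(self) -> bool:
        """A self link joins two residues of the same protein."""
        return self.protein_a == self.protein_b


def self_links_checked_by_default(protein_count: int) -> bool:
    """Described default: the 'Self' box starts unchecked for large searches."""
    return protein_count <= SELF_LINK_DEFAULT_LIMIT


def visible_links(links, show_self: bool):
    """Apply the 'Self' filter: drop self links unless the box is checked."""
    return [link for link in links if show_self or not link.is_self]


if __name__ == "__main__":
    # Eight self links on a single protein, in a search listing 88 proteins
    # (an illustrative count above the threshold).
    links = [CrossLink("LYSC_CHICK", "LYSC_CHICK") for _ in range(8)]
    shown = visible_links(links, self_links_checked_by_default(88))
    print(f"{len(shown)} out of {len(links)}")
```

With only self links in the file and more than 50 proteins listed, every link is filtered out under this default, which matches the "0 out of 8" symptom. Passing `show_self=True` brings all eight back.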

The linear identifications (uncrosslinked identifications) aren't displayed on the network visualisation page (it would be good to have them there as an indication of sequence coverage). There is another way to look at the linears (the easily missed "Spectra Only" link on the history page), but that doesn't seem to be working correctly. I'll need to look into it...

vrkosk commented 2 years ago

"Self" was indeed unchecked and checking it displays the crosslinks. I agree it's confusing when only one protein in the file has intralinks and there are no interlinks.

colin-combe commented 2 years ago

yes, it's super confusing (@grandrea)

colin-combe commented 2 years ago

@vrkosk @grandrea - that particular piece of confusing behaviour should be changed now (self links now always selected by default)

vrkosk commented 2 years ago

Seems to be fixed. I uploaded the same file with sameset proteins included. The file has one intralinked protein and 87 proteins with no links, and now intralinks are shown by default after uploading the file.