ProteoWizard / pwiz

The ProteoWizard Library is a set of software libraries and tools for rapid development of mass spectrometry and proteomic data analysis software.
http://proteowizard.sourceforge.net/
Apache License 2.0
210 stars 97 forks

Fix hang in ProteinMetadataManager #3030

Closed nickshulman closed 2 weeks ago

nickshulman commented 1 month ago

Fixed potential hang extracting chromatograms when some protein groups have proteins with nonstandard accession numbers (reported by Mike)

The problem is that when some protein groups have invented accession numbers, a partially searched ProteinGroupMetadata can end up in ProteinMetadataManager._processedNodes. When that happens, the incompletely searched protein groups in the document never get searched to completion: ProteinMetadataManager keeps reporting that it has more work to do, but all it does is call the very expensive method "SrmDocument.ChangeChildrenChecked" with nothing new.

This change checks whether the entry in "_processedNodes" is only partially completed and, if so, queues up requests for the unsearched parts of the group.
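As a rough illustration of the check described above (not the actual Skyline code, which is C#), here is a minimal Python sketch. The names GroupMetadata, requests_to_queue, document_groups and processed_nodes are hypothetical stand-ins for ProteinGroupMetadata and the "_processedNodes" cache, and the per-protein "searched" flag is an assumption about how partial completion could be tracked.

```python
from dataclasses import dataclass


@dataclass
class GroupMetadata:
    """Hypothetical stand-in for ProteinGroupMetadata."""
    # protein name -> True once the metadata search for that protein has finished
    searched: dict

    def unsearched(self):
        return [name for name, done in self.searched.items() if not done]

    def is_complete(self):
        return not self.unsearched()


def requests_to_queue(document_groups, processed_nodes):
    """Return the (group_id, protein_name) pairs that still need a search.

    document_groups: group_id -> list of protein names in the document
    processed_nodes: group_id -> GroupMetadata cached from earlier passes
    """
    pending = []
    for group_id, proteins in document_groups.items():
        cached = processed_nodes.get(group_id)
        if cached is None:
            # Never processed: every protein in the group needs a search.
            pending.extend((group_id, name) for name in proteins)
        elif not cached.is_complete():
            # Cached but only partially searched: queue just the missing parts
            # instead of treating the cached entry as finished.
            pending.extend((group_id, name) for name in cached.unsearched())
    return pending


# Example: group g1 was cached while one of its proteins was still unsearched.
document_groups = {"g1": ["P12345", "my_invented_id"], "g2": ["Q67890"]}
processed_nodes = {
    "g1": GroupMetadata({"P12345": True, "my_invented_id": False}),
    "g2": GroupMetadata({"Q67890": True}),
}
pending = requests_to_queue(document_groups, processed_nodes)
```

In this sketch, skipping the "elif" branch reproduces the failure mode: g1 would count as processed, nothing would be queued for it, and the caller would keep believing there was work left without ever making progress.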

(This diff is simpler if you hide whitespace changes)

chambm commented 3 weeks ago

IMO it should be an error to have multiple proteins with the same name in the same document, much less in the same protein group. How the heck could users tell what node a name was referring to if it's not unique? RenameProteinsDlg.UseFastaFile() for example will throw an error if multiple proteins have the same name. I don't know how strictly we enforce that in other places where proteins are imported though.

bspratt commented 3 weeks ago

> IMO it should be an error to have multiple proteins with the same name in the same document, much less in the same protein group.

Same name in same group seems bad, yes, but is there no use case for grouping proteins in various ways?

nickshulman commented 2 weeks ago

/rebase