Pyteomics is a collection of lightweight and handy tools for Python that help to handle various sorts of proteomics data. Pyteomics provides a growing set of modules to facilitate the most common tasks in proteomics data analysis.
This PR changes the UniRef header parser in a way similar to what #93 did for UniProt: it now recognizes only the keys described in the UniProt specification for UniRef.
In addition to changing the pattern, it removes the extra keys and values that were produced by splitting the values of UniqueIdentifier and RepresentativeMember on `_`. That splitting was causing errors on the UniRef database downloaded from uniprot.org, so apparently the parser wasn't getting much use, and the change is unlikely to affect anyone.
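For illustration, here is a minimal sketch of a parser that recognizes only the documented UniRef fields. It assumes the header layout `>UniqueIdentifier ClusterName n=Members Tax=TaxonName TaxID=TaxonIdentifier RepID=RepresentativeMember`; the pattern and the function name `parse_uniref_header` are hypothetical and are not the code in this PR.

```python
import re

# Hypothetical pattern, assuming the UniRef FASTA header layout:
# >UniqueIdentifier ClusterName n=Members Tax=TaxonName TaxID=TaxonIdentifier RepID=RepresentativeMember
UNIREF_HEADER = re.compile(
    r'^(?P<UniqueIdentifier>\S+)\s+'
    r'(?P<ClusterName>.*?)\s+'
    r'n=(?P<Members>\d+)\s+'
    r'Tax=(?P<TaxonName>.*?)\s+'
    r'TaxID=(?P<TaxonIdentifier>\S+)\s+'
    r'RepID=(?P<RepresentativeMember>\S+)\s*$'
)


def parse_uniref_header(header):
    """Return a dict with only the documented keys (illustrative helper)."""
    match = UNIREF_HEADER.match(header.lstrip('>'))
    if match is None:
        raise ValueError('Not a UniRef-style header: {!r}'.format(header))
    info = match.groupdict()
    # Members is a count, so convert it; all other values stay as strings.
    info['Members'] = int(info['Members'])
    return info


example = ('>UniRef50_Q9K794 Putative AgrB-like protein n=2 '
           'Tax=Bacillus TaxID=1386 RepID=AGRB_BACHD')
parsed = parse_uniref_header(example)
```

Because the values of UniqueIdentifier and RepresentativeMember are returned whole, the result has the same set of keys for every header that matches, regardless of whether those values contain an underscore.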
However, some alternatives can still be discussed, such as providing the extra keys only when splitting succeeds. That would cause errors in user code whenever a key is unexpectedly missing, whereas with the currently proposed change these keys are consistently absent from the output.
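The trade-off of that alternative can be sketched as follows. The helper `add_split_keys` and the key names `rep_prefix` and `rep_suffix` are hypothetical, chosen only for this example, and the second input value is made up to show an identifier without an underscore.

```python
def add_split_keys(info):
    """Hypothetical helper: add extra keys only if the value splits into two parts."""
    out = dict(info)
    parts = info['RepresentativeMember'].split('_')
    if len(parts) == 2:
        # The extra keys appear only for values of the form PREFIX_SUFFIX.
        out['rep_prefix'], out['rep_suffix'] = parts
    return out


with_underscore = add_split_keys({'RepresentativeMember': 'AGRB_BACHD'})
without_underscore = add_split_keys({'RepresentativeMember': 'UPI0000000001'})

# User code that indexes the extra key works for the first entry
# and raises KeyError for the second.
suffix = with_underscore['rep_suffix']
try:
    without_underscore['rep_suffix']
    key_error_raised = False
except KeyError:
    key_error_raised = True
```

With conditional keys, the failure moves from the parser into user code and depends on the data. Dropping the keys entirely makes the output shape predictable, at the cost of leaving any splitting to the caller.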