This PR adds NRPS substrates to 430 previously unannotated clusters and updates 769 substrates in existing entries. Evidence codes and publications were added as well. Clusters whose original substrate specificity annotations conflicted with the new annotations were manually curated and changed accordingly.
check_valid.py was run on all .json files against the updated schema and returned no errors. All substrates (322 distinct molecules) now have valid SMILES strings, validated with PIKAChU. The temporary structure code '[Po][Po]', which was used for missing substrate structures, no longer appears in any .json file.
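For reviewers who want to re-check the placeholder claim independently, a minimal sketch follows. It is not part of check_valid.py; the function names and the directory layout are hypothetical, and it only assumes that each entry is a parseable JSON file in which the placeholder would appear somewhere inside a string value.

```python
import json
from pathlib import Path

# Temporary structure code named in this PR description.
PLACEHOLDER = "[Po][Po]"


def iter_strings(node):
    """Yield every string value nested anywhere inside parsed JSON."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)


def files_with_placeholder(root):
    """Return sorted paths of .json files under root that still contain the placeholder."""
    hits = []
    for path in sorted(Path(root).rglob("*.json")):
        with open(path, encoding="utf-8") as handle:
            data = json.load(handle)
        if any(PLACEHOLDER in text for text in iter_strings(data)):
            hits.append(path)
    return hits
```

Calling files_with_placeholder on the directory that holds the entries should return an empty list if the statement above holds. The sketch walks parsed string values rather than searching raw file text, so the match does not depend on how the JSON was escaped on disk.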
The JSON schema was updated to include additional evidence codes and to rename Aspartate and Glutamate to Aspartic acid and Glutamic acid, respectively.