Closed jschrier closed 1 year ago
That's really interesting that some have no name specified! I'll look into the issue and see what's going on with it!
It may also be the case that the "name" dictionary key is incorrect for an entry (analogous to the InChI error we found yesterday).
Now that all the dictionaries are combined into one document, some names map to different SMILES. For one molecule, the entries carried the same information but different SMILES strings. I have to look into this further.
Most of the duplicated names came along with duplicate InChIKeys and referred to the same molecule. The only issue I ran into was with 18-crown-6 and potassium acetate 18-crown-6.
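One way to separate the harmless duplicates from the ones that need a closer look is sketched below. It is only an illustration: the field names `names`, `smiles`, and `inchikey` are assumptions about the record layout, the function name is hypothetical, and the sample records use placeholder keys rather than real data. The idea is that two different SMILES strings can describe the same molecule, so a name with several SMILES but a single InChIKey is probably fine, while a name with several InChIKeys deserves review.

```python
from collections import defaultdict

def classify_name_conflicts(records):
    """Split names with conflicting SMILES into likely-same-molecule vs. needs-review.

    Assumes each record is a dict with hypothetical keys
    "names" (list of str), "smiles" (str) and "inchikey" (str).
    """
    smiles_by_name = defaultdict(set)
    keys_by_name = defaultdict(set)
    for record in records:
        for name in record.get("names", []):
            smiles_by_name[name].add(record.get("smiles"))
            keys_by_name[name].add(record.get("inchikey"))

    same_molecule, needs_review = [], []
    for name, smiles in smiles_by_name.items():
        if len(smiles) < 2:
            continue  # only one SMILES for this name, nothing to compare
        if len(keys_by_name[name]) == 1:
            same_molecule.append(name)   # different SMILES, same InChIKey
        else:
            needs_review.append(name)    # different SMILES and different InChIKeys
    return sorted(same_molecule), sorted(needs_review)

# Placeholder records for illustration only (keys are not real InChIKeys).
records = [
    {"names": ["ethanol"], "smiles": "CCO", "inchikey": "KEY-1"},
    {"names": ["ethanol"], "smiles": "OCC", "inchikey": "KEY-1"},
    {"names": ["name-x"], "smiles": "C", "inchikey": "KEY-2"},
    {"names": ["name-x"], "smiles": "CC", "inchikey": "KEY-3"},
]
same_molecule, needs_review = classify_name_conflicts(records)
```

Comparing raw SMILES strings will over-report differences; the InChIKey comparison is what actually decides whether two entries are the same molecule in this sketch.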
As suggested by @oliviavanden, I implemented a check for whether two different entries share an entry in their names lists. Each "name" is potentially an identifier for the extraction record, so names must be unique; otherwise it is unclear which record a name refers to.
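For reference, a check of this kind might look roughly like the sketch below. This is not the code that was actually committed: the `names` key, the function name, and the case-insensitive comparison are all assumptions made for illustration, and the sample records are placeholders.

```python
from collections import defaultdict

def find_shared_names(records):
    """Return {name: [record indices]} for names listed by more than one record.

    Assumes each record is a dict with a hypothetical "names" key holding
    a list of strings. Names are compared after stripping whitespace and
    lowercasing, which is a design choice, not something the thread specifies.
    """
    owners = defaultdict(set)
    for idx, record in enumerate(records):
        for name in record.get("names", []):
            owners[name.strip().lower()].add(idx)
    return {name: sorted(ids) for name, ids in owners.items() if len(ids) > 1}

# Placeholder records for illustration only.
records = [
    {"names": ["name-x", "alias-1"]},
    {"names": ["Name-X ", "alias-2"]},
    {"names": ["name-y"]},
]
shared = find_shared_names(records)
```

Using a set of indices per name means a record that lists the same name twice is not reported as a conflict with itself.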
There are entries in
../chemical_dictionaries/chem_dictionary_records.json
that have no name specified! There are also about a dozen cases in which the same name is used by entries in different dictionary files. But I suspect that these might be resolved automatically if one first resolves the open issue #11 (same-inchi-across-two-different-dictionaries).
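A minimal sketch of how the unnamed entries could be listed, assuming the records have been loaded into a list of dicts and that names live under a hypothetical `names` key (the actual layout of the JSON file may differ, and the function name is made up for illustration):

```python
def find_unnamed(records):
    """Return indices of records with no usable name.

    Assumes a hypothetical "names" key holding a list of strings.
    A record counts as unnamed if the key is missing, None, empty,
    or contains only blank strings.
    """
    unnamed = []
    for idx, record in enumerate(records):
        names = record.get("names") or []
        if not any(isinstance(n, str) and n.strip() for n in names):
            unnamed.append(idx)
    return unnamed

# Placeholder records for illustration only.
records = [
    {"names": ["name-x"]},
    {"names": []},
    {"inchikey": "KEY-1"},
    {"names": ["  "]},
    {"names": None},
]
unnamed = find_unnamed(records)
```

To run this against the real file, the list would first have to be read with `json.load`, adjusting for however the top level of the file is actually structured.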