MassBank / MassBank-data

Official repository of open data MassBank records

Update NaToxAq #50

Closed meier-rene closed 5 years ago

schymane commented 5 years ago

InChIKeys follow a very specific pattern: three blocks of fixed length (14, 10, and 1 uppercase letters) separated by dashes. Hopefully that makes them easy to flag, because "normal" names would never take this form, especially with a last block of only one character. In names, the dashes usually come at the beginning and precede a longer text-based part, and they often separate numbers from letters. I'd do a specific character and dash count, and if a name fits the InChIKey pattern exactly, flag it for removal. Theoretically it should also match the record's InChIKey entry, and if it doesn't, well, that's an interesting flag that something went wrong somewhere ;-)
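A minimal sketch of that check in Python, assuming the standard InChIKey layout of 14, 10, and 1 uppercase letters joined by dashes. The function name `flag_inchikey_names` and its arguments are hypothetical and not part of any MassBank tooling; the name list and the record's InChIKey are assumed to have been parsed out of the record already.

```python
import re

# Standard InChIKey layout: 14 letters, 10 letters, 1 letter, joined by dashes.
INCHIKEY_RE = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def flag_inchikey_names(names, record_inchikey):
    """Return (name, matches_record) for every name shaped like an InChIKey.

    Hypothetical helper: `names` is the list of compound names from one
    record and `record_inchikey` is that record's own InChIKey entry.
    """
    flagged = []
    for name in names:
        candidate = name.strip()
        # fullmatch: the whole name must fit the pattern, not just a part of it.
        if INCHIKEY_RE.fullmatch(candidate):
            flagged.append((name, candidate == record_inchikey))
    return flagged
```

Every name returned is a candidate for removal. A `False` in the second position marks the "something went wrong somewhere" case, where the name looks like an InChIKey but differs from the record's own entry.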