Discrepancies in summary data

The summary data in the last release (https://github.com/openkinome/kinodata/releases/tag/v0.2) differs from what is reported by the notebooks and from what is in the data files they output:

Dataset	Non-curated	Curated
ChEMBL 27	182 223	148 836
ChEMBL 28	199 238	159 978

versus the number of unique records in the output .csv files, as reported in the notebooks:

Dataset	Non-curated	Curated
ChEMBL 27	217 612	174 238
ChEMBL 28	237 336	186 972
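
To make the size of the gap explicit, here is a minimal sketch that only does the arithmetic on the two tables above. The variable and function names are mine, and nothing here reads the actual data files.

```python
# Counts copied from the two tables above: (release summary, notebook output).
counts = {
    ("ChEMBL 27", "Non-curated"): (182_223, 217_612),
    ("ChEMBL 27", "Curated"): (148_836, 174_238),
    ("ChEMBL 28", "Non-curated"): (199_238, 237_336),
    ("ChEMBL 28", "Curated"): (159_978, 186_972),
}


def compare(release, notebook):
    """Return the absolute gap and the release count as a fraction of the notebook count."""
    return notebook - release, release / notebook


gaps = {key: compare(*pair) for key, pair in counts.items()}

for (dataset, kind), (gap, fraction) in gaps.items():
    print(f"{dataset:<10} {kind:<12} gap={gap:>6}  release/notebook={fraction:.3f}")
```

In all four cells the release figure is lower than the notebook figure, and by a broadly similar proportion. That might point to a consistent extra filtering step behind the first table, but that is only a guess on my part.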

The notebooks appear to run fine, so I have added the figures from the notebooks/output files themselves to the latest release (https://github.com/openkinome/kinodata/releases/tag/v0.3). However, having looked at the data directly, I can't establish where the figures in the first table come from.
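
For anyone who wants to cross-check the record counts independently of the notebooks, here is a rough sketch of one way to count unique rows in a CSV using only the standard library. It assumes "unique" means whole-row uniqueness, which may not match how the notebooks define it. The sample data and column names are made up for illustration and are not taken from the real output files.

```python
import csv
import io


def count_unique_rows(handle):
    """Count distinct data rows (header excluded) in an open CSV file handle."""
    reader = csv.reader(handle)
    next(reader, None)  # skip the header row
    return len({tuple(row) for row in reader})


# Hypothetical stand-in for one of the output .csv files; column names are invented.
sample = io.StringIO(
    "record_id,value\n"
    "a,1\n"
    "b,2\n"
    "a,1\n"  # exact duplicate of the first data row
)
unique_rows = count_unique_rows(sample)
print(unique_rows)
```

For a real file, the same function can be given a handle from `open(path, newline="")`, where `path` is whichever output file is being checked.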
