Rewrites the RDKit table operations to use subqueries and combined queries. This makes processing each dataset faster and makes it much easier to rerun the add_datasets.py script without redoing work.
@bdeadman @qai222 With these changes, a full reload of the database took about 9 hours, down from 40+ hours. I'm running it again now that I've updated the queries to avoid temporary tables.
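
For readers unfamiliar with the pattern, here is a minimal sketch of a combined query that fills a derived table in one statement, with no temporary table, and skips rows that were already processed so a rerun does no extra work. It is not the code in this PR. Everything in it is an assumption made for illustration: the `source` and `derived` tables, the `add_missing_rows` helper, SQLite standing in for the real database, and `upper()` standing in for whatever RDKit conversion the real queries perform.

```python
import sqlite3


def add_missing_rows(conn: sqlite3.Connection) -> int:
    """Fill the derived table in a single statement; return rows inserted.

    Hypothetical helper: the subquery filters out rows that already have a
    derived entry, so calling this again does not redo finished work.
    """
    cur = conn.execute(
        """
        INSERT INTO derived (source_id, value)
        SELECT s.id, upper(s.smiles)  -- upper() is a stand-in for the real conversion
        FROM source AS s
        WHERE NOT EXISTS (
            SELECT 1 FROM derived AS d WHERE d.source_id = s.id
        )
        """
    )
    conn.commit()
    return cur.rowcount


# Hypothetical schema and data, held in memory for the sketch.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE source (id INTEGER PRIMARY KEY, smiles TEXT NOT NULL);
    CREATE TABLE derived (
        source_id INTEGER PRIMARY KEY REFERENCES source (id),
        value TEXT
    );
    INSERT INTO source (smiles) VALUES ('c1ccccc1'), ('cco');
    """
)

first_run = add_missing_rows(conn)   # processes both rows
second_run = add_missing_rows(conn)  # finds nothing left to do
```

In this sketch the second call inserts nothing because the `NOT EXISTS` subquery excludes every row handled by the first call, which is the property that makes a rerun cheap.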