Closed: benedekrozemberczki closed this 2 years ago
I’d strongly request provenance information on how these new datasets were constructed. Were they automatically downloaded from some external repo? Was any processing done to them?
Yes, @cthoyt I will do that in a moment.
There will be a whole Appendix section about this in the paper.
While you’re thinking about it, maybe also consider doing the same for the previous two datasets as well :)
@cthoyt How about a dataset preprocessing section in the documentation?
Tbh the only documentation of data preprocessing that matters to me is code that can exactly reproduce it. Let’s start there and backfill prose-based documentation wherever the code itself can’t document a step well enough.
Added the cleaning scripts.
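For readers following along, here is a minimal sketch of what "code that can exactly reproduce it" could look like for a drug-pair dataset. It is not the cleaning script added in this PR: the column names (`drug_1`, `drug_2`, `label`), the function names, and the toy rows are all hypothetical. The idea it illustrates is that cleaning is a pure, deterministic function of the raw rows, so the output checksum stays the same across runs and across input orderings.

```python
import csv
import hashlib
import io


def clean_pairs(rows):
    """Return a sorted, de-duplicated list of (drug_1, drug_2, label) tuples.

    Hypothetical cleaning rules: strip whitespace, drop malformed rows,
    drop rows with empty fields, and drop pairs of a drug with itself.
    """
    cleaned = set()
    for row in rows:
        if len(row) != 3:
            continue  # malformed row
        left, right, label = (field.strip() for field in row)
        if not left or not right or not label:
            continue  # missing value
        if left == right:
            continue  # a drug paired with itself
        cleaned.add((left, right, label))
    # Sorting makes the output independent of the input order.
    return sorted(cleaned)


def to_csv(pairs):
    """Serialize the cleaned pairs to CSV text with a fixed header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["drug_1", "drug_2", "label"])
    writer.writerows(pairs)
    return buffer.getvalue()


def checksum(text):
    """SHA-256 of the serialized dataset, usable as a provenance record."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Toy input with made-up identifiers (not real DrugBank or TwoSides data).
raw = [
    ["B ", "A", "1"],
    ["A", "C", "0"],
    ["B", "A", "1"],  # duplicate once whitespace is stripped
    ["D", "D", "1"],  # self-pair
    ["E", "", "1"],   # missing value
]

pairs = clean_pairs(raw)
output = to_csv(pairs)
print(output)
print(checksum(output))
```

Committing a checksum like this next to the script would let a reviewer confirm that rerunning the cleaning step on the same raw download yields byte-identical output.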
It appears you merged the branch with failing tests. This shouldn’t be allowed or even possible. The solution is to add some branch protection rules in the repository settings.
Does it require changes to GitHub Actions?
Nope, that looks right!
Summary
Adding DrugBank DDI and TwoSides.
[X] Code passes all tests
[X] Unit tests provided for these changes
[X] Documentation and docstrings added for these changes
Changes