Closed ireneisdoomed closed 2 years ago
Probes have been regenerated from the data in the 01.2022 SQL dump. Apart from the changes mentioned above, we are also bringing in 2 more probe sets based on their list.
However, the data have not been uploaded yet because there is a parsing issue. This is a breakdown of the number of probes per data source:

| Data source | 22.02 | 22.04 |
|---|---|---|
| Open Science Probes | 112 | 130 |
| Probe Miner | 3256 | 3255 |
| Probes & Drugs Portal | 93 | 0 |
| High-quality chemical probes | 0 | 940 |
| opnMe Portal | 73 | 87 |
| Chemical Probes.org (legacy) | 602 | 0 |
| Chemical Probes.org | 0 | 748 |
| Bromodomains chemical toolbox | 57 | 55 |
| Gray Laboratory Probes | 88 | 134 |
| Protein methyltransferases chemical toolbox | 28 | 28 |
| Nature Chemical Biology Probes | 51 | 51 |
| SGC Probes | 162 | 170 |
| JUMP-Target 2 Compound Set | 0 | 72 |
| JUMP-Target 1 Compound Set | 0 | 72 |
| Natural product-based probes and drugs | 0 | 19 |
| Chemical Probes for Understudied Kinases | 0 | 41 |
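
To make the differences between the two releases easier to spot, here is a rough sketch that takes the counts from the table above and flags which sources are new, dropped, or kept in 22.04. The `classify` helper and the `changes` dictionary are names made up for this illustration; they are not part of the parser.

```python
# Counts copied from the breakdown table: source -> (22.02, 22.04)
counts = {
    "Open Science Probes": (112, 130),
    "Probe Miner": (3256, 3255),
    "Probes & Drugs Portal": (93, 0),
    "High-quality chemical probes": (0, 940),
    "opnMe Portal": (73, 87),
    "Chemical Probes.org (legacy)": (602, 0),
    "Chemical Probes.org": (0, 748),
    "Bromodomains chemical toolbox": (57, 55),
    "Gray Laboratory Probes": (88, 134),
    "Protein methyltransferases chemical toolbox": (28, 28),
    "Nature Chemical Biology Probes": (51, 51),
    "SGC Probes": (162, 170),
    "JUMP-Target 2 Compound Set": (0, 72),
    "JUMP-Target 1 Compound Set": (0, 72),
    "Natural product-based probes and drugs": (0, 19),
    "Chemical Probes for Understudied Kinases": (0, 41),
}


def classify(old: int, new: int) -> str:
    """Hypothetical helper: label a source by its presence in each release."""
    if old == 0 and new > 0:
        return "new"
    if old > 0 and new == 0:
        return "dropped"
    return "kept"


# source -> (difference between releases, status label)
changes = {
    source: (new - old, classify(old, new))
    for source, (old, new) in counts.items()
}

# Print the sources ordered by the size of the change, smallest first.
for source, (delta, status) in sorted(changes.items(), key=lambda kv: kv[1][0]):
    print(f"{source:<45} {delta:+6d}  {status}")
```

The two "dropped" rows and the "High-quality chemical probes" row are the ones that look suspicious and are worth checking against the parsing issue.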
“High Quality probes” shouldn’t be considered a source, as these are reported with the flag `isQuality`. Also, “Probes & Drugs Portal” is not represented for the same reason: these are probes in the high-quality set that are not present in any other source.
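
A minimal sketch of how the fix could look, assuming each probe comes with a list of probe set names. This is not the actual parser code: the input shape, `normalise`, and the fallback to “Probes & Drugs Portal” for probes that only appear in the high-quality set are all assumptions based on the description above.

```python
from collections import Counter

# Label as it shows up in the breakdown table.
HIGH_QUALITY_LABEL = "High-quality chemical probes"
# Assumption: probes found only in the high-quality set belong to this source.
FALLBACK_SOURCE = "Probes & Drugs Portal"


def normalise(probe_id, set_names):
    """Hypothetical helper: turn the high-quality set into a flag, not a source."""
    is_quality = HIGH_QUALITY_LABEL in set_names
    sources = [name for name in set_names if name != HIGH_QUALITY_LABEL]
    if is_quality and not sources:
        sources = [FALLBACK_SOURCE]
    return {"id": probe_id, "sources": sources, "isQuality": is_quality}


# Made-up input, only to illustrate the three cases.
raw = {
    "probe-a": ["SGC Probes", "High-quality chemical probes"],
    "probe-b": ["High-quality chemical probes"],
    "probe-c": ["Probe Miner"],
}

records = [normalise(probe_id, names) for probe_id, names in raw.items()]

# Per-source counts no longer contain the high-quality label.
per_source = Counter(source for record in records for source in record["sources"])
```

With this approach the high-quality information is kept on every record through `isQuality`, and the per-source breakdown only lists real sources.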
I’ll get back to this bug right after Easter.
WIP notebook can be found at: https://github.com/opentargets/evidence_datasource_parsers/blob/il-22.04/exploration/chemicalProbes/chemical_probes.ipynb
The module has been written and we've been able to generate a dataset based on the newest P&D version without the above-mentioned parsing bug.
However, there is a bug in the data itself, as reported by the ChemicalProbes.org people, which hasn't been solved yet.
I will open the PR with the code as it is now, and we can tackle the data bug in a follow-up ticket.
P&Ds has made a new release and we should therefore rerun the current chemical probes pipeline.
Changes
Based on their release notes: