BioDataFuse / pyBiodatafuse

Python package for the BioDataFuse project.
MIT License

Query size for PubChem #152

Closed tabbassidaloii closed 2 months ago

tabbassidaloii commented 2 months ago

The query size reported in the PubChem metadata should be checked. For example, when using DMD as input, the total number of Uniprot-TrEMBL IDs for the gene is 40 (based on BridgeDb), while the metadata reports 654:

len(bridgdb_df[bridgdb_df["target.source"]=="Uniprot-TrEMBL"])
# 40

{'datasource': 'PubChem',
 'query': {'size': 654,
  'input_type': 'Uniprot-TrEMBL',
  'time': '0:00:01.057647',
  'date': '2024-08-13 12:32:29',
  'url': 'https://idsm.elixir-czech.cz/sparql/endpoint/idsm'}}
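The comparison above could be automated along these lines. This is only a rough sketch, not the fix that was merged: the helper names `count_unique_targets` and `check_query_size` are hypothetical, the `target` column name is an assumption about the BridgeDb output, and counting unique IDs (rather than rows) is also an assumption. It works on plain row dicts, which a DataFrame can provide via `df.to_dict("records")`.

```python
def count_unique_targets(rows, source):
    """Count distinct target IDs whose 'target.source' equals `source`."""
    return len({row["target"] for row in rows if row["target.source"] == source})


def check_query_size(rows, metadata):
    """Return (expected, reported, match) for the metadata's query size."""
    query = metadata["query"]
    expected = count_unique_targets(rows, query["input_type"])
    return expected, query["size"], expected == query["size"]


# Toy mapping rows with made-up IDs (not real BridgeDb output).
rows = [
    {"identifier": "DMD", "target": "ID_1", "target.source": "Uniprot-TrEMBL"},
    {"identifier": "DMD", "target": "ID_2", "target.source": "Uniprot-TrEMBL"},
    {"identifier": "DMD", "target": "ID_2", "target.source": "Uniprot-TrEMBL"},  # duplicate row
    {"identifier": "DMD", "target": "ID_3", "target.source": "Ensembl"},
]

# Metadata shaped like the dict reported above, trimmed to the fields used here.
metadata = {
    "datasource": "PubChem",
    "query": {"size": 654, "input_type": "Uniprot-TrEMBL"},
}

expected, reported, match = check_query_size(rows, metadata)
print(expected, reported, match)
```

With the toy rows, the expected size is 2 while the reported size is 654, so the check flags a mismatch, in the same way as the 40 vs. 654 discrepancy described here.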

YojanaGadiya commented 2 months ago

Issue fixed in PR #158