Closed Favourj-bit closed 5 hours ago
@cannin @dexterpratt
I tried to extract the XML for some of the PMIDs listed in the text document in the acc zipped file mentioned in #2, and I keep getting the error shown in the attached screenshot.
@cannin @dexterpratt
I have already tried the paper PMC333362 (PMID: 13086), which came from the txt PMIDs file. I extracted the text using the read_pdf function I started out with, and I have pushed some results to the repo. I also tried gpt-4, gpt-4-turbo and gpt-4o on the paper and noted the differences in run time for each model.
I noticed that gpt-4 hallucinated: it added the examples I showed it in the prompt to the results even though they were not part of the paper. This made me refine my prompt to tell it explicitly to use those examples only as a guide for structuring the results.
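To make the prompt change concrete, here is a minimal sketch of how the examples could be marked as format-only. The function name `build_prompt` and the exact wording of the instruction are my own assumptions for illustration, not the prompt that is in the repo.

```python
def build_prompt(paper_text, examples):
    """Assemble an extraction prompt whose examples are marked as format-only.

    Hypothetical helper: the wording below is an assumption, not the repo's prompt.
    """
    # Render each example as a bullet so it is visually separate from the paper text.
    example_block = "\n".join(f"- {example}" for example in examples)
    return (
        "Extract the molecular interactions stated in the paper below.\n"
        "The examples only illustrate the output format. "
        "Do not include them in the results unless the paper itself states them.\n\n"
        f"Format examples:\n{example_block}\n\n"
        f"Paper text:\n{paper_text}\n"
    )


# Placeholder inputs, not taken from any paper.
prompt = build_prompt("A binds B.", ["X activates Y"])
print(prompt)
```

Keeping the examples in their own labelled block, separate from the paper text, should make it easier for the model to tell the two apart, though I have not measured how much this helps.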
Access the INDRA statements for a specific publication:
from indra.sources import indra_db_rest
ip = indra_db_rest.get_statements_for_papers([('pmid', '27153756')])
curl -X POST https://db.indra.bio/query/statements \
-H "Content-Type: application/json" \
-d '{"query": {"class": "FromPapers", "constraint": {"paper_list": [["pmid", "27153756"]]}, "inverted": false}}'
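The same REST request can also be built from Python without installing INDRA. This is a minimal sketch using only the standard library: the URL and the JSON body are copied from the curl command above, while the helper names `build_from_papers_query` and `post_query` are hypothetical. I am not assuming anything about the shape of the response, so `post_query` just returns the parsed JSON as-is.

```python
import json
import urllib.request

# URL taken from the curl command above.
INDRA_DB_URL = "https://db.indra.bio/query/statements"


def build_from_papers_query(paper_ids):
    """Build the same JSON body as the curl command for (id_type, id) pairs."""
    return {
        "query": {
            "class": "FromPapers",
            "constraint": {
                "paper_list": [[id_type, str(paper_id)] for id_type, paper_id in paper_ids]
            },
            "inverted": False,
        }
    }


def post_query(payload, url=INDRA_DB_URL, timeout=60):
    """POST the payload as JSON and return the decoded response (needs network access)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Only the payload is built here; call post_query(payload) to actually send it.
payload = build_from_papers_query([("pmid", "27153756")])
print(json.dumps(payload))
```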
This is a Pipfile for use with pipenv:
[[source]]
name = "pypi"
url = "https://pypi.org/simple"
verify_ssl = true
[dev-packages]
ipython = "*"
[packages]
requests = "*"
tqdm = "*"
jsonpath-ng = "*"
pyjnius = ">=1.3.0"
indra = {git = "https://github.com/sorgerlab/indra.git", editable = true, ref = "8919f134bbcdb08bd0dc288fe8b6b79a4f6acc94"}
[requires]
python_version = "3.10"
Hi @cannin @dexterpratt
I was able to install and configure INDRA. Then I tried to get the INDRA statements for the paper with PMCID PMC333362.
However, I seem to be getting far fewer statements than with my GPT extraction code, and this is confusing me.
This link contains the results: https://github.com/ndexbio/gsoc_llm/blob/main/results.json
And this is the code used: https://github.com/ndexbio/gsoc_llm/blob/main/python_scripts/get_indra_statements.py
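One way to see where the two result sets differ, rather than only comparing counts, is to reduce both to (subject, relation, object) triples and diff them as sets. The sketch below assumes both outputs can be flattened that way, which may not match the actual layout of results.json. The helper names are hypothetical and the triples are made-up placeholders, not results from the paper.

```python
def normalize(triple):
    """Normalize case and whitespace so trivially different spellings still match."""
    subject, relation, obj = triple
    return (subject.strip().upper(), relation.strip().lower(), obj.strip().upper())


def compare_extractions(indra_triples, llm_triples):
    """Return the triples shared by both sources and those unique to each."""
    indra_set = {normalize(t) for t in indra_triples}
    llm_set = {normalize(t) for t in llm_triples}
    return {
        "shared": sorted(indra_set & llm_set),
        "indra_only": sorted(indra_set - llm_set),
        "llm_only": sorted(llm_set - indra_set),
    }


# Placeholder triples for illustration only.
indra_triples = [("GENE_A", "Phosphorylation", "GENE_B")]
llm_triples = [
    ("gene_a", "phosphorylation", "gene_b"),
    ("GENE_C", "Activation", "GENE_D"),
]

report = compare_extractions(indra_triples, llm_triples)
for key, triples in report.items():
    print(key, len(triples), triples)
```

The "llm_only" group is the one to check by hand first, since it would contain both genuine statements that INDRA missed and anything the model hallucinated.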
@cannin
For the paper I am supposed to work with, where can I get the INDRA results to compare against?
Also, when searching, I put "pmid" in front of the number, because without it the number just directs me to a gene on the NIH website. I wanted to confirm that I am searching correctly. I got the result below.