Open daithi45 opened 7 months ago
At first, I thought "this is just the nature of CAS numbers" or "this is just how searching for CAS numbers on PubChem works", but in this example, if I search for 613-33-2 on PubChem, I only get one result. It might be worth double-checking how we are querying the PubChem API here, @stitam, and whether there is an alternative that returns only the best match according to PubChem ("best" is currently not an option for the match argument of get_cid()).
Interestingly, if I take out the from = "cas" argument, I only get one CID back. I will try this on my main dataset and see if it works!
> get_cid("613-33-2", match = "all")
# A tibble: 1 × 2
  query    cid
  <chr>    <chr>
1 613-33-2 11941
Hi all, I'm running a dataset of ~1000 CAS numbers through webchem to pull CIDs. For about half of them, it returns multiple CIDs.
Most of the time, the first CID it returns isn't the correct one, so each of those needs manual checking. Is there any way to improve my approach to reduce the manual element?
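One way to shrink the manual step might be to triage the results before checking anything by hand. The sketch below is not part of webchem: `triage_cids()` is a hypothetical helper, written in base R, that assumes the input has the `query` and `cid` columns shown in the `get_cid()` output in this thread. It separates queries that map to exactly one CID from those that map to several, so only the second group needs a closer look. The rows for `CAS-A` and `CAS-B` are made-up placeholders, not real lookups.

```r
# Hypothetical helper (not a webchem function): split get_cid()-style results
# into unambiguous, ambiguous, and unmatched queries.
triage_cids <- function(res) {
  # Count the distinct, non-missing CIDs returned for each query.
  n_cid <- tapply(res$cid, res$query, function(x) length(unique(x[!is.na(x)])))
  counts <- data.frame(
    query = names(n_cid),
    n_cid = as.integer(n_cid),
    stringsAsFactors = FALSE
  )
  list(
    # Exactly one CID: can probably be accepted without manual review.
    unique = res[res$query %in% counts$query[counts$n_cid == 1], ],
    # Several CIDs: these still need a decision.
    ambiguous = res[res$query %in% counts$query[counts$n_cid > 1], ],
    # No CID at all.
    missing = counts$query[counts$n_cid == 0]
  )
}

# In practice `res` would come from something like
#   res <- webchem::get_cid(cas_numbers, from = "cas", match = "all")
# Here it is built by hand; only the first row is taken from this thread.
res <- data.frame(
  query = c("613-33-2", "CAS-A", "CAS-A", "CAS-B"),
  cid   = c("11941", "111", "222", NA),   # "111"/"222" are placeholder CIDs
  stringsAsFactors = FALSE
)

out <- triage_cids(res)
print(out$unique)
print(out$ambiguous)
print(out$missing)
```

For the ambiguous group, re-running only those queries without `from = "cas"` (as tried above for 613-33-2) and comparing the two answers could be a reasonable next step, though whether that gives the right CID in general is untested here.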