Open Rekyt opened 1 year ago
@Rekyt @bmaitner As with #14 and #15, the source of this issue (and any potential fix) is almost certainly the perl controller. I really wanted to avoid messing with the controller, but re-assigning IDs is a serious issue. I'll take a look.
Thank you again for your quick answers @ojalaquellueva! And good luck with the controller...
@Rekyt I am still scoping this out. Assuming I can isolate and replicate the issue within the Perl controller, I will transfer the issue to https://github.com/ojalaquellueva/TNRSbatch. In the meantime, you can bypass issues #14, #15 and #16 by processing your names as follows:
1. Exclude any names that are all whitespace, NULL, NA, or the empty string.
2. Deduplicate the remaining names, extracting them to a new data frame and assigning each a new integer ID (e.g., "unique.name.ID").
3. Submit the pre-processed names + unique.name.IDs to the TNRS.
4. After processing, transfer the results back to the original data frame by joining on the (now unique) Name_submitted in the TNRS results.
@bmaitner See my recommendation above. This is how I have always pre-processed names, which explains why I never noticed the issues spotted by @Rekyt. For now, I suggest that both of us add these pre-processing recommendations to our documentation.
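For anyone who wants to see the four pre-processing steps spelled out, here is a minimal sketch. It is written in plain Python purely for illustration (in R the same steps map onto `unique()` and `merge()`), and it is not the package's own code. The helper names `is_blank`, `preprocess` and `join_back` are hypothetical, the string "NA" stands in for R's NA, and the TNRS call in step 3 is replaced by a stand-in that simply echoes the names back.

```python
def is_blank(name):
    """True for None, the string "NA", empty, or whitespace-only names."""
    return name is None or name.strip() in ("", "NA")


def preprocess(names):
    """Steps 1-2: drop blank names, deduplicate, assign new integer IDs."""
    unique_names = []
    seen = set()
    for name in names:
        if is_blank(name) or name in seen:
            continue
        seen.add(name)
        unique_names.append(name)
    # IDs start at 1 and follow first-appearance order.
    return [(i, name) for i, name in enumerate(unique_names, start=1)]


def join_back(names, results):
    """Step 4: map each original name to its result row via Name_submitted."""
    by_name = {row["Name_submitted"]: row for row in results}
    # Blank inputs were never submitted, so they get no result (None).
    return [None if is_blank(name) else by_name.get(name) for name in names]


original = ["Acer rubrum", "", "Quercus alba", None, "Acer rubrum", "   ", "NA"]
submitted = preprocess(original)

# Step 3 would send `submitted` to the TNRS. This stand-in only echoes
# the names back and is NOT real TNRS output.
fake_results = [{"ID": i, "Name_submitted": n} for i, n in submitted]
joined = join_back(original, fake_results)
```

Because the join uses the name rather than the ID, the result stays correct even if the service re-assigns IDs on its side.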
Thanks for the very detailed process! Will follow right away :)
Added a note to the readme. Thanks for catching this, @Rekyt
I also hit a strange bug when the input contains an empty string (not the same as #14).
Sometimes, if the dataset is "big enough", submitting a data.frame that contains an empty string drops that row from the input and moves the next name up to its ID instead. This causes many problems if you want to match the names back through the ID.
See my reproducible example:
Created on 2023-02-14 with reprex v2.0.2
I dug into the issue by looking at the query made through TNRS_core(), and it all seems fine. The data JSON looks like this:
So perfectly fine. But the API returns the same table as above, with the names moved up. So it seems to be an issue with the API rather than with the query.
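Until the API side is fixed, a cheap sanity check is to compare what was sent with what came back, ID by ID. The sketch below is only an illustration in plain Python. The function name `find_id_mismatches` is hypothetical, the field names "ID" and "Name_submitted" are assumed to match the result table, and the response is a hand-made simulation of the symptom described above, not real API output.

```python
def find_id_mismatches(submitted, returned):
    """Return the IDs whose returned Name_submitted differs from what was sent."""
    sent = dict(submitted)
    got = {row["ID"]: row["Name_submitted"] for row in returned}
    # An ID missing from the response also counts as a mismatch.
    return sorted(i for i in sent if got.get(i) != sent[i])


submitted_rows = [(1, "Acer rubrum"), (2, ""), (3, "Quercus alba")]

# Simulated response: the empty name is dropped and the next name
# moves up to its ID (hand-made for illustration).
shifted_response = [
    {"ID": 1, "Name_submitted": "Acer rubrum"},
    {"ID": 2, "Name_submitted": "Quercus alba"},
]

mismatches = find_id_mismatches(submitted_rows, shifted_response)
```

Any non-empty result means the IDs can no longer be trusted for matching, and joining on the submitted name is the safer route.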