My indexing code was written to stop whenever it saw a duplicated article, which was meant to prevent articles from being indexed more than once.
Essentially, I was trying to anticipate the unfortunate situation of bioRxiv adding new articles while I'm indexing.
Lately that situation has been happening a lot, probably because of the increase in bioRxiv submissions. Because of this, and because bioRxiv sometimes actually does post a duplicated article, which would bring my indexing to a halt, I changed the code to recognize and remove any duplicated articles.
This seems to have been working well, except for this article, which didn't get indexed:
"Stochastic character mapping of state-dependent diversification reveals the tempo of evolutionary decline in self-compatible Onagraceae lineages"
https://www.biorxiv.org/content/early/2018/06/22/210484
I don't know what happened here.
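For anyone curious, the "recognize and remove duplicates" idea described above can be sketched roughly as follows. This is only an illustration, not my actual indexing code: the function name `drop_duplicate_articles` and the assumption that each article is a dict with a `"url"` key are made up for the example.

```python
from typing import Dict, Iterable, List


def drop_duplicate_articles(articles: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep the first copy of each article and silently skip later copies.

    Assumption (hypothetical): every article is a dict with a "url" key
    that uniquely identifies it.
    """
    seen_urls = set()
    unique = []
    for article in articles:
        url = article["url"]
        if url in seen_urls:
            # A duplicate: skip it instead of halting the whole indexing run.
            continue
        seen_urls.add(url)
        unique.append(article)
    return unique


# Hypothetical listing in which the first article shows up twice.
listing = [
    {"url": "https://example.org/a", "title": "Article A"},
    {"url": "https://example.org/b", "title": "Article B"},
    {"url": "https://example.org/a", "title": "Article A"},
]
unique_articles = drop_duplicate_articles(listing)
```

The point of skipping rather than stopping is that one repeated entry no longer prevents the rest of the listing from being indexed.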