rempsyc / busara_dashboard

The Missing Majority in Behavioural Science Dashboard
https://remi-theriault.com/dashboards/missing_majority

Deal with PloS One #24

Closed rempsyc closed 3 months ago

rempsyc commented 6 months ago

It would be nice to be able to deal with the PloS One problem. Our general strategy for handling the number of journals we have is to loop over years (otherwise the data is too large to be fetched in a single request). For Science and Nature, that was still too big, so we had to loop over months. For PloS One, even looping over months wasn't enough.

I wonder if we should attempt to loop per day if we really want to include it; hopefully that would be enough. This is a bit more complicated, since the number of months per year is fixed but the number of days per month varies.
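To illustrate the per-day idea, here is a minimal sketch in Python (the dashboard itself is not written in Python; this only shows the date logic). It relies on the standard library's `calendar.monthrange` to get the number of days in each month, so leap years and month lengths do not need to be hard-coded. The helper names `days_in_month` and `day_query`, and the exact PubMed query string format, are assumptions made for illustration and are not taken from the project's code.

```python
import calendar
from datetime import date


def days_in_month(year, month):
    """Return every calendar day of the given month as a list of dates."""
    # monthrange returns (weekday of first day, number of days in month)
    n_days = calendar.monthrange(year, month)[1]
    return [date(year, month, d) for d in range(1, n_days + 1)]


def day_query(journal, day):
    """Build a one-day query string (hypothetical format, for illustration)."""
    stamp = day.strftime("%Y/%m/%d")
    return f'"{journal}"[Journal] AND ("{stamp}"[PDAT] : "{stamp}"[PDAT])'


# One query per day of February 2012 (a leap year, so 29 queries)
queries = [day_query("PLoS One", d) for d in days_in_month(2012, 2)]
```

Each generated string could then be passed to whatever fetching function is used for the yearly and monthly loops.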

The other alternative would be to include the historical data that was successfully fetched, but that's only for years 2007 to 2011.

Alternatively, I wonder whether we should have one dashboard per discipline: (1) Psychology (only psych journals), (2) Economics (only economics journals), and (3) Interdisciplinary (only general journals).

rempsyc commented 6 months ago

This could potentially be fixed by #25

New features of easyPubMed version 3.01: Automatic job splitting into sub-queries. The Entrez server imposes a strict n=10,000 limit on the number of records that can be programmatically retrieved from a single query. Whenever possible, the easyPubMed library automatically attempts to split queries returning a large number of records into lists of smaller, manageable queries.
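For intuition, the general idea of splitting a large query into sub-queries can be sketched as recursive halving of a date range until each piece falls under the 10,000-record limit. This is not easyPubMed's actual implementation, which is not described here; it is only one plausible approach, written in Python. The `count_records` argument is a hypothetical callable that returns how many records a date range would yield.

```python
from datetime import date, timedelta

RECORD_LIMIT = 10_000  # per-query retrieval limit quoted above


def split_range(start, end, count_records, limit=RECORD_LIMIT):
    """Split [start, end] into sub-ranges that each stay within the limit.

    count_records(start, end) is a hypothetical callable supplied by the
    caller; it should return the number of records in that date range.
    """
    # Stop when the range fits, or when it is a single day and
    # cannot be split any further by date alone.
    if count_records(start, end) <= limit or start == end:
        return [(start, end)]
    # Halve the range and recurse on both halves
    mid = start + (end - start) // 2
    left = split_range(start, mid, count_records, limit)
    right = split_range(mid + timedelta(days=1), end, count_records, limit)
    return left + right
```

Compared with a fixed per-day loop, this kind of adaptive splitting only issues small queries where the data is dense, so sparse years would still be fetched in a single request.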