Closed tarrow closed 8 years ago
cc #65
Great! Please make sure it applies to all the APIs.
Whoops, I sort of over looked this. Should I be making an arxiv patch on master or on the arxiv update branch?
I've now merged the arxiv branch into master, so you can fetch the changes and work directly in this PR
Great; I'll get right on that!
On Wed, Feb 24, 2016 at 11:37 AM, Richard Smith-Unna < notifications@github.com> wrote:
I've now merged the arxiv branch into master, so you can fetch the changes and work directly in this PR
— Reply to this email directly or view it on GitHub https://github.com/ContentMine/getpapers/pull/75#issuecomment-188213228.
you'll notice in my PR I changed the pageSize param we send to EuropePMC to 100. the default is 25 but it can be as many as 1000. Do you have any thoughts as to what we should set it to? A larger number means fewer requests but it takes EuropePMC longer to respond. At 1000 we can hit the timeout. It also means the progress bar has much less granularity.
This is in this pull request because obviously we need to know the page size to page it.
If you could have a look at this for me @Blahah that would be great. I think it is ok; I'm always keen to get back any criticism you might have about style or logic though.
It should address all apis and enable -k to limit the number of papers.
It also closes #74.
This would be super valuable. In demos it's useful to limit demo to (say) 200 hits.
Subject to code-review by @Blahah this code should now all be ready to go.
Edit: as per my above. I thought this was going to the bug report not the PR. This wasn't meant to be a repeat.
On Thu, Feb 25, 2016 at 11:46 AM, petermr notifications@github.com wrote:
This would be super valuable. In demos it's useful to limit demo to (say) 200 hits.
— Reply to this email directly or view it on GitHub https://github.com/ContentMine/getpapers/pull/75#issuecomment-188748927.
@tarrow sorry it took me a while to get to this.
RE the page size for eupmc - 100 is fine for now. I set the ArXiv one by experimenting with a range of values and choosing the largest one that never timed out in a large number of test runs. But it's probably internet connection speed dependent - we may have to revisit later.
Code all looks good - great job!
This is a patch to limit the number of hits with -k or --limit. It also fixes a possible bug #74