ropensci / openalexR

Getting bibliographic records from OpenAlex
https://docs.ropensci.org/openalexR/

How can I use filter and search in `oa_query()` with a query that is too long? #163

Closed: rkrug closed this issue 11 months ago

rkrug commented 12 months ago

Hi

I want to use OpenAlex to replace WoS in my workflow. As a starting point, I would like to search for terms such as:

biodiversity AND Water AND (management OR Change)

only much longer.

The full term consists of three sub-terms, combined with AND:

st = s1 AND s2 AND s3

But the resulting URL is much too long (Request Line is too large (6038 > 4094)).

How can I split the query or partition it, so that I can run the complete query st?

Thanks.

yjunechoe commented 12 months ago

In theory, the intersection of (A, B, C, D) is equivalent to intersecting the intersection of (A, B) with the intersection of (C, D).

So you could split it into two queries: one for A and B and another for C and D, and then inner join the resulting two data frames (by the ID column as the key) such that you only keep the papers that overlap. The subset of papers you end up with will be the ones that satisfy the A and B and C and D condition.

Of course, this is only equivalent in theory and assumes that the conditions are evaluated independently by the API, but I think this should give you a good approximation.
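A minimal sketch of the join step, using base R only. The two toy data frames stand in for the results of two separate, shorter requests; `intersect_by_id()` is a hypothetical helper written for this example, not part of openalexR, and it assumes each result has an `id` column. It keeps the rows whose `id` occurs in every result set, which gives the same papers as an inner join when both requests return the same columns.

```r
# Toy stand-ins for two result sets. In real use these would be the data
# frames returned by two separate requests, one per half of the AND query.
res_ab <- data.frame(
  id    = c("W1", "W2", "W3", "W4"),
  title = c("Paper 1", "Paper 2", "Paper 3", "Paper 4"),
  stringsAsFactors = FALSE
)
res_cd <- data.frame(
  id    = c("W3", "W4", "W5"),
  title = c("Paper 3", "Paper 4", "Paper 5"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep only the rows whose key occurs in every result set.
intersect_by_id <- function(results, key = "id") {
  # Intersect the key columns of all data frames
  common <- Reduce(intersect, lapply(results, function(df) df[[key]]))
  # Take the rows from the first data frame, dropping duplicated keys
  first <- results[[1]]
  first <- first[!duplicated(first[[key]]), , drop = FALSE]
  first[first[[key]] %in% common, , drop = FALSE]
}

both <- intersect_by_id(list(res_ab, res_cd))
print(both$id)
```

With openalexR, `res_ab` and `res_cd` would presumably each come from its own `oa_fetch(entity = "works", search = ...)` call with a shorter search string. Because the helper takes a list, the same approach should extend to three or more partial queries if two halves are still too long.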

rkrug commented 11 months ago

Thanks. I will look into that.