pybliometrics-dev / pybliometrics

Python-based API-Wrapper to access Scopus
https://pybliometrics.readthedocs.io/en/stable/
Other
411 stars 128 forks source link

Deprecating the start parameter #93

Closed crew102 closed 5 years ago

crew102 commented 5 years ago

I would suggest deprecating the start parameter. There are a few reasons:

  1. Currently the value of start is always set to 0 by all classses that inherit from Search. In other words, starting the result set at a non-0 value has never actually been an option for users of the package, as the parameter is always set to 0 by Search:

https://github.com/scopus-api/scopus/blob/260adf18ac65baf507bdf5d0b786f40b0d11836f/scopus/classes/search.py#L83

  1. I don't think the above point has been realized until now b/c I imagine it's very rare for the user to want to download a partial result set. In other words, I don't think this parameter has many potential use cases, so supporting it by fixing the issue mentioned above wouldn't even be worth it.

  2. If we did want to allow the user to set start, we'd need to rework how the results are iterated over in the Search class (i.e., rework code shown below). This is b/c the value of 'opensearch:totalResults' doesn't account for the value of start. For example, if my query returns 1,000 records and my start value is 200, the value of 'opensearch:totalResults' will still be 1,000, even though there's a record offset.

https://github.com/scopus-api/scopus/blob/260adf18ac65baf507bdf5d0b786f40b0d11836f/scopus/classes/search.py#L85-L99

Michael-E-Rose commented 5 years ago

Absolutely correct. The count parameter is exclusively for internal use, so we won't need to parameterize that. In the future, the count variable will only be there for non-subscribers, but not as parameter.

Michael-E-Rose commented 5 years ago

And I think we don't really need to deprecate this. Only users that directly implement the Search() class actually see the count variable, and I think no such user exists.

crew102 commented 5 years ago

You're talking about the start param and not count, correct?

Ah, I see how using start only for users who want to extend Search makes sense. Still, if such a user did exist and wanted to set a non-0 value for start, it wouldn't matter b/c start is always set to 0 in Search. So even if they extended Search, they'd have to rewrite the code in search so that start is actually passed along to line 83 shown above.

crew102 commented 5 years ago

Only users that directly implement the Search() class actually see the count variable, and I think no such user exists.

Users of ScopusSearch don't see the start variable, but users of AffiliationSearch and AuthorSearch do, so I think there's a good reason to deprecate.

Michael-E-Rose commented 5 years ago

Yes, I meant start, sorry.

In fact, when I made scopus 1.0 I forgot to remove the parameters altogether. I do believe no one uses them, as there is no proper reason do to. But sure, we can properly deprecate them, I guess this won't hurt.

Do you want to provide a PR as well?

crew102 commented 5 years ago

Sure, I can provide a PR. Do you want to do a full deprecation as discussed in #84, or just remove the param. In my opinion it doesn't need all the fan-fare of a full deprecation, given that no one is probably using the param to begin with.

Michael-E-Rose commented 5 years ago

Let's be on the conservative side and add a flag that we keep for one minor release.

That is, when this PR is live, I accept it and upgrade scopus to 1.5. In 1.6 (maybe one month later) we remove it altogether.

crew102 commented 5 years ago

What do you mean by flag? As in, we do something like this?

https://github.com/scopus-api/scopus/blob/f5c6d32df6c240d1e598f9f4f04fe8cf3f2ec2aa/scopus/abstract_retrieval.py#L579-L583

Michael-E-Rose commented 5 years ago

Exaclty :)