dspinellis / alexandria3k

Local relational access to openly-available publication data sets
GNU General Public License v3.0

Implementation of IssnSubjectCodes for Crossref-2024 with download method #44

Closed panos-span closed 1 month ago

panos-span commented 1 month ago

Title: Implementation of IssnSubjectCodes for Crossref-2024, introducing a new method for data_source called download.

Description: This pull request introduces the IssnSubjectCodes module for Crossref-2024 into the Alexandria3k project. The module processes ISSN to ASJC (All Science Journal Classification) subject codes, supporting queries over its virtual table and enabling the population of an SQLite database with its data. This is a useful workaround for obtaining the subjects for the works table, since Crossref no longer provides them.

Key features of this implementation include:

Changes:

Commit Details:

Library Used: This implementation utilizes the pybliometrics library to interact with the Elsevier API. The ASJC codes are retrieved through this API. Users will need an API key and an institutional token to access the Elsevier API. The config file pybliometrics.cfg must be inside the ./config folder of the user.

Testing: Unit tests have been added to verify:

Notes: Ensure that the pybliometrics library is correctly initialized and that the required API keys are configured.

Checklist:

dspinellis commented 1 month ago

This is impressive; well done! Please:

panos-span commented 1 month ago

With an institutional token, which must be requested from the Elsevier API support team, the API allows 9 requests per second, up to a limit of 25000 requests per week. A user can add multiple API keys, not just one.

This data source should be used only for custom examples, when the user already has in mind the range of ISSN subjects that might be needed. I added the details at the start of the class.
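To make the quoted limits concrete, here is a minimal sketch of a client-side throttle. The class name `RequestBudget` and its parameters are hypothetical and are not part of pybliometrics or Alexandria3k; the sketch also assumes the weekly quota is tracked only for the lifetime of the object and is never reset.

```python
import time


class RequestBudget:
    """Throttle calls to a per-second rate and a fixed total quota.

    Hypothetical helper: it does not reset the quota after a week.
    """

    def __init__(self, per_second=9, weekly_quota=25000,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / per_second
        self.remaining = weekly_quota
        self._clock = clock
        self._sleep = sleep
        self._last = None

    def acquire(self):
        """Wait until the next request is allowed; fail once the quota is spent."""
        if self.remaining <= 0:
            raise RuntimeError("Weekly request quota exhausted")
        now = self._clock()
        if self._last is not None:
            # Sleep for whatever is left of the minimum spacing between calls.
            wait = self.min_interval - (now - self._last)
            if wait > 0:
                self._sleep(wait)
                now = self._clock()
        self._last = now
        self.remaining -= 1
```

Calling `acquire()` before each API request would keep a batch within both limits. At 9 requests per second the weekly quota of 25000 is spent in roughly 46 minutes, which is why the range of ISSNs should be narrowed in advance.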

panos-span commented 1 month ago

I implemented the latest recommended changes. I just don't know where to upload a test config file for pybliometrics so that the CI will pass.

Once this is addressed, I will squash my commits.

dspinellis commented 1 month ago

> I implemented the latest recommended changes. I just don't know where to upload a test config file for pybliometrics so that the CI will pass.
>
> Once this is addressed, I will squash my commits.

The CI script can copy the config file from tests/data to $HOME. However, note that the config file shall not include any secret (e.g. API) keys. If secrets are needed to test it, then you should mock the response.
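As an illustration of mocking the response so that no secret is needed, here is a minimal sketch using `unittest.mock`. The function `subject_codes_for` and the `client.lookup` method are hypothetical stand-ins; they are not the PR's code and not a pybliometrics call.

```python
from unittest.mock import Mock


def subject_codes_for(issn, client):
    """Return the sorted ASJC codes for an ISSN using the supplied client.

    `client.lookup` is a hypothetical method standing in for the real
    API wrapper.
    """
    record = client.lookup(issn)
    return sorted(record["asjc_codes"])


def test_subject_codes_for_uses_mocked_response():
    # The mock replaces the network call, so no API key is required.
    client = Mock()
    client.lookup.return_value = {"asjc_codes": [2700, 1000]}
    assert subject_codes_for("1234-5678", client) == [1000, 2700]
    client.lookup.assert_called_once_with("1234-5678")
```

Passing the client in as a parameter keeps the test free of patching by module path; `unittest.mock.patch` would be the alternative if the client is created inside the function.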

panos-span commented 1 month ago

alexandria3k/data_sources/issn_subject_codes.py:89:4: R0913: Too many arguments (6/5) (too-many-arguments)

Should I give the user fewer options? What should I do?

Everything else can be handled immediately.

dspinellis commented 1 month ago

> alexandria3k/data_sources/issn_subject_codes.py:89:4: R0913: Too many arguments (6/5) (too-many-arguments)
>
> Should I give the user fewer options? What should I do?
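One common way to resolve R0913 without removing options, not necessarily the approach adopted in this PR, is to bundle the optional settings into a single object. The names `SourceOptions` and `ExampleSource` and all their fields below are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SourceOptions:
    """Hypothetical bundle of optional settings, passed as one argument."""
    sample: Optional[object] = None
    attach_databases: Optional[list] = None
    config_path: Optional[str] = None


class ExampleSource:
    """Illustrative class; not the PR's IssnSubjectCodes."""

    def __init__(self, data_source, options=None):
        # Three arguments (including self) instead of one per option.
        self.data_source = data_source
        self.options = options or SourceOptions()
```

The other standard option is to keep the signature and silence the check locally with a `# pylint: disable=too-many-arguments` comment on the method.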

panos-span commented 1 month ago

Everything should work as intended now!

dspinellis commented 1 month ago

Note that a data source may (as in your case) or may not (as in all others currently) support a download subcommand. In the second case a suitable error message should be shown.
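The behaviour described here could be sketched as follows, assuming a shared base class whose default `download` reports the lack of support. `DataSource`, `DownloadableSource`, and `run_download` are hypothetical names, not Alexandria3k's actual classes.

```python
class DataSource:
    """Hypothetical base class for data sources."""

    def download(self, path):
        """Default: report that this source cannot be downloaded."""
        raise NotImplementedError(
            f"{type(self).__name__} does not support the download subcommand"
        )


class DownloadableSource(DataSource):
    """Hypothetical source that overrides download."""

    def download(self, path):
        return f"downloaded to {path}"


def run_download(source, path):
    """Run the download, turning missing support into a readable message."""
    try:
        return source.download(path)
    except NotImplementedError as err:
        return f"error: {err}"
```

With this arrangement only sources that override `download` need any extra code; every other source produces the error message by default.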

panos-span commented 1 month ago

If everything looks good, I will squash the commits into one again, and this time try not to mess up the whitespace.

panos-span commented 1 month ago

The populate method now accepts input_path as input, as does the download method. In addition, we check whether the populate method's signature can take the extra parameter, so that no issues arise. Lastly, the file created is no longer temporary (its path is provided by the user).
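A signature check of this kind can be sketched with the standard `inspect` module. The helper names `accepts_parameter` and `call_populate` are hypothetical, and the sketch assumes `populate` takes the database path as its first argument.

```python
import inspect


def accepts_parameter(func, name):
    """Return True if func can be called with a keyword argument `name`."""
    params = inspect.signature(func).parameters
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    if name in params and params[name].kind in keyword_kinds:
        return True
    # A **kwargs parameter accepts any keyword argument.
    return any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )


def call_populate(populate, database_path, input_path=None):
    """Pass input_path only to populate methods that can receive it."""
    if input_path is not None and accepts_parameter(populate, "input_path"):
        return populate(database_path, input_path=input_path)
    return populate(database_path)
```

Checking the signature lets existing data sources, whose populate methods lack the parameter, keep working unchanged.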

panos-span commented 1 month ago

> Still work to do I'm afraid. I hope this doesn't discourage you.

No problem! Every suggestion is welcome and appreciated.

panos-span commented 1 month ago

All done!