Closed beingkk closed 1 year ago
Thanks a lot @ampudia19, will implement your suggestions!
Thanks again @ampudia19, I fixed the issues highlighted above, namely:
- `getters.novelty.py` and adding defaults in the docstring
- `novelty_utils.py`, `calculate_openalex_novelty.py` and `openalex_topic_novelty.py`
- `upload_to_s3` variable to the pipelines (default will be True).

Hope this is OK to merge, @ampudia19?
Re topic names: I'm happy to add an adjustment to use ChatGPT topic names in a separate issue, #71 (perhaps once you've merged the corresponding PR). Hope that's alright?
All is looking good, Karlis :)
Closes #64
The main addition is the pipeline for calculating a topic-level novelty score, characterising the "uncommonness" of the research related to a given topic (in a given year).
The usage is as follows:
This will output five tables (one per taxonomy level), with two alternative novelty scores per topic, per year.
At the moment, this is based only on the OpenAlex data.
In a forthcoming issue/PR, I will apply the same analysis on patent data, to generate novelty scores using patents as well.
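To illustrate the shape of the output described above (one row per topic, per year, with two alternative scores), here is a minimal sketch. It is not the pipeline's code: the record layout, the `topic_year_novelty` helper, and the use of mean and median as the two alternative scores are all assumptions made for illustration only.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical per-work records: (topic, year, work-level novelty score).
# Names and values are illustrative; they are not taken from the pipeline.
works = [
    ("topic_a", 2020, 0.2),
    ("topic_a", 2020, 0.6),
    ("topic_a", 2021, 0.9),
    ("topic_b", 2020, 0.1),
    ("topic_b", 2020, 0.3),
    ("topic_b", 2020, 0.8),
]


def topic_year_novelty(records):
    """Aggregate work-level scores into one row per (topic, year).

    Mean and median stand in for the "two alternative novelty scores";
    the actual scores used in the PR may be defined differently.
    """
    grouped = defaultdict(list)
    for topic, year, score in records:
        grouped[(topic, year)].append(score)
    return {
        key: {
            "mean_novelty": mean(scores),
            "median_novelty": median(scores),
            "n_works": len(scores),
        }
        for key, scores in sorted(grouped.items())
    }


table = topic_year_novelty(works)
for (topic, year), row in table.items():
    print(topic, year, row)
```

Running the same aggregation once per taxonomy level would give one such table per level.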
Checklist:
- `notebooks/`
- `pre-commit` and addressed any issues not automatically fixed
- `dev`
- `README`s