Rather than retagging every paper in the 4.5M papers we have as proposed in https://github.com/ResearchHub/issues/issues/12, we could simplify things by only retagging content that's associated with our current users. That is:
Papers associated with comments
Papers previously claimed
We could do so by iterating over each user.author
Why do we need to do this?
Calculating reputation for users in our platform is a priority
Verified users would have their works fetched on the fly when they claim profile
Tasks
[ ] #42
[ ] Tag content with subfields
ℹ️ We can use load_works_from_openalex.py script with --mode backfill
Rather than retagging every paper in the 4.5M papers we have as proposed in https://github.com/ResearchHub/issues/issues/12, we could simplify things by only retagging content that's associated with our current users. That is:
We could do so by iterating over each
user.author
Why do we need to do this?
Tasks
load_works_from_openalex.py
script with--mode backfill