ResearchHub / issues


Retag all content with OpenAlex subfields #12

Open yattias opened 1 month ago

yattias commented 1 month ago

The goal of this issue is to have every unified_document on the platform tagged with one of the 195 OpenAlex subfields. These subfields will be used to display a user's reputation.

More on subfields and topics: https://help.openalex.org/how-it-works/topics

Existing and new documents should all be tagged with subfields (including posts and questions!)
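
To make the tagging goal concrete, here is a minimal sketch of how a subfield could be read off an OpenAlex-style work record and attached to a document. It assumes each record carries a `primary_topic` object with a nested `subfield`, as described in the OpenAlex topics docs linked above. The `openalex_work` key, the document dict shape, and both function names are hypothetical and not taken from the ResearchHub codebase.

```python
from __future__ import annotations

from typing import Optional


def subfield_for_work(work: dict) -> Optional[str]:
    """Return the subfield display name of an OpenAlex-style work record, or None."""
    # Assumption: the record has `primary_topic` -> `subfield` -> `display_name`.
    topic = work.get("primary_topic") or {}
    subfield = topic.get("subfield") or {}
    return subfield.get("display_name")


def tag_documents(documents: list[dict]) -> tuple[dict, list]:
    """Split documents into {doc_id: subfield} and a list of ids left untagged."""
    tagged: dict = {}
    untagged: list = []
    for doc in documents:
        # `openalex_work` is a hypothetical key holding the matched OpenAlex record.
        name = subfield_for_work(doc.get("openalex_work") or {})
        if name is None:
            # No OpenAlex match (likely for posts and questions); these would
            # need another tagging route, which this sketch does not cover.
            untagged.append(doc["id"])
        else:
            tagged[doc["id"]] = name
    return tagged, untagged
```

The untagged list is returned separately so that posts and questions, which have no OpenAlex record of their own, stay visible as outstanding work instead of being silently skipped.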

Todos:

The first three todos are prerequisites for the new rep algorithm.

yattias commented 1 month ago

@TylerDiorio

An important question here is whether we should only have as many primary hubs as there are subfields. That is, 190 hubs.

So basically everything else (concepts) becomes something else, like keywords, and we only have 190-ish hubs.

TylerDiorio commented 4 weeks ago

Hey Kobe, it sounds like the tangible options for "Primary Hubs" would basically be 252 Subfields, 4516 Topics, or ~50K keywords if we're using the OpenAlex topics as our dataset: https://docs.google.com/spreadsheets/d/1v-MAq64x4YjhO7RWcB-yrKV5D_2vOOsxl4u6GBKEXY8/edit#gid=983250122.

I think there are some trade-offs to be had with each:

252 Subfields

4516 Topics

~50K keywords

From a UI perspective, I think it makes the most sense to show the 252 Subfields as "Primary Hubs". From a researcher's perspective, I think it would feel best to have your keywords incorporated somehow into the Subfields. As a biomedical engineer focused on brain biomechanics, I think my ideal, expected labels would be something like "Primary Hub": Biomedical Engineering (Subfield) with my most frequently encountered keywords from my papers:

My thought here is that the keywords are what actually best describe my research interests/expertise, so they should be included somehow. The only concern would be spreading users too thinly at the platform's current size, which is maybe where Subfields as aggregators make a lot of sense, with the ability to retain that specificity.

Use Case for specificity (keywords):

Subfields alone might be insufficient for searching in a lot of cases, since Biomedical Engineering, for example, is very broad (devices, cell work, computational models, etc.)
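
As a rough illustration of the "Subfield as aggregator, keywords for specificity" idea, the sketch below builds a label from a researcher's papers and then filters on it. It assumes each paper is a plain dict with a `subfield` string and a `keywords` list. Those field names and both function names are hypothetical, and counting raw keyword frequency is just one possible ranking.

```python
from collections import Counter


def expertise_label(papers, top_n=3):
    """Summarise papers as a primary hub (subfield) plus the most frequent keywords."""
    # The most common subfield across the papers acts as the "Primary Hub".
    subfields = Counter(p["subfield"] for p in papers if p.get("subfield"))
    # Keywords are pooled across all papers and ranked by raw frequency.
    keywords = Counter(k for p in papers for k in p.get("keywords", []))
    primary = subfields.most_common(1)[0][0] if subfields else None
    return {
        "primary_hub": primary,
        "keywords": [k for k, _ in keywords.most_common(top_n)],
    }


def matches(label, subfield, keyword=None):
    """Broad search by subfield, optionally narrowed by one keyword."""
    if label["primary_hub"] != subfield:
        return False
    return keyword is None or keyword in label["keywords"]
```

Searching by subfield alone would return everyone in a broad hub, while passing a keyword narrows the result to people whose labels carry that term, which is the specificity this use case asks for.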

yattias commented 3 weeks ago

Replacing this milestone ticket with a smaller-scope alternative and removing the milestone: https://github.com/ResearchHub/issues/issues/40