Adds a pipeline to tag documents with topics from a taxonomy.
Fixes #55
Modifies 3 scripts:
getters/taxonomies: changes the S3 file path for the cooccurrence taxonomy
getters/openalex: adds a getter to fetch labeled OpenAlex data
getters/patents: adds a getter to fetch labeled patents data
Adds 1 script:
pipeline/label_docs_with_taxonomy.py: labels documents with taxonomy labels at every level of a specified taxonomy and saves the outputs to S3 (default) or locally
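For reviewers who want a mental model of what "labels at every level" means, here is a minimal sketch. It is not the code in pipeline/label_docs_with_taxonomy.py. The taxonomy shape (entity mapped to one label per level), the majority-vote rule, and the names TAXONOMY, label_doc and label_docs are all assumptions made for illustration, and saving to S3 or locally is left out.

```python
from collections import Counter

# Hypothetical taxonomy: entity -> label at each level (level 1 is the broadest).
# The real taxonomy format in this repo may differ.
TAXONOMY = {
    "neural network": {1: "1", 2: "1_2", 3: "1_2_0"},
    "gradient descent": {1: "1", 2: "1_2", 3: "1_2_1"},
    "solar cell": {1: "4", 2: "4_0", 3: "4_0_3"},
}


def label_doc(entities, taxonomy):
    """Return the most frequent taxonomy label at each level for one document.

    Majority vote is an assumed rule, chosen only to keep the sketch short.
    """
    counts = {}
    for entity in entities:
        # Entities missing from the taxonomy contribute no labels.
        for level, label in taxonomy.get(entity, {}).items():
            counts.setdefault(level, Counter())[label] += 1
    return {level: c.most_common(1)[0][0] for level, c in sorted(counts.items())}


def label_docs(docs, taxonomy):
    """Label every document; docs maps a document id to its list of entities."""
    return {doc_id: label_doc(entities, taxonomy) for doc_id, entities in docs.items()}
```

A document whose entities are all absent from the taxonomy gets an empty dict, so callers can tell "unlabeled" apart from "labeled at some levels only".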