Open turbomam opened 1 year ago
Especially config/curated_slot_notes_by_text_mining.tsv
text_mining_results/mixs_v6_repaired_term_title_token_matrix.tsv: config/curated_slot_notes_by_text_mining.tsv \
  generated_schema/GSC_MIxS_6.yaml schemasheets_to_usage/GSC_MIxS_6_concise_usage.tsv
	$(RUN) add_notes_from_text_mining \
		--dtm-input-slot title \
		--input-col-vals-file text_mining_results/mixs_v6_repaired_term_title_token_list.tsv \
		--input-dtm-notes-mapping $(word 1,$^) \
		--input-schema-file $(word 2,$^) \
		--input-usage-report $(word 3,$^) \
		--output-schema-file generated_schema/GSC_MIxS_6.yaml.notated.yaml \
		--dtm-output $@
There are probably better ways to extract topics, e.g. with more advanced tokenization, vectorization, and thesaurus lookup.
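For discussion, here is a rough sketch of what tokenization, vectorization, and thesaurus lookup could look like when building a document-term matrix from slot titles. It is not what `add_notes_from_text_mining` does today. The thesaurus entries, stopword list, and the two example slots and titles are all made up for illustration.

```python
import re
from collections import Counter

# Hypothetical thesaurus: maps variant tokens to a preferred term.
THESAURUS = {"temp": "temperature", "conc": "concentration"}

# Hypothetical stopword list; a real one would be larger.
STOPWORDS = {"of", "the", "and", "in"}


def tokenize(title):
    """Lowercase, split on non-alphanumerics, drop stopwords, normalize via thesaurus."""
    tokens = re.findall(r"[a-z0-9]+", title.lower())
    return [THESAURUS.get(t, t) for t in tokens if t not in STOPWORDS]


def build_dtm(titles):
    """Return (sorted vocabulary, {slot name: row of token counts})."""
    counts = {name: Counter(tokenize(title)) for name, title in titles.items()}
    vocab = sorted(set().union(*counts.values()))
    rows = {name: [c[t] for t in vocab] for name, c in counts.items()}
    return vocab, rows


# Illustrative slot names and titles, not taken from the schema.
titles = {
    "samp_temp": "sample temp",
    "water_temp": "temperature of the water",
}
vocab, rows = build_dtm(titles)
print(vocab)
for name, row in rows.items():
    print(name, row)
```

With a thesaurus step like this, "temp" and "temperature" fall into the same column, so one curated note in the mapping file could cover both spellings.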