joelansbro / pipeline

API Pipeline DB middleware

Create keyword frequency job #21

Closed joelansbro closed 2 years ago

joelansbro commented 2 years ago

Changes:
- Fix the duplication error by loading the data differently in the keywordjob.
- Add a reportjob that returns the most common keywords from a dataset.
- Update 1234.json into schema.json and use it in the intakejob.
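For reviewers, here is a rough sketch of what "return the most common keywords from a dataset" could look like. It is not the code in this PR: the function name `most_common_keywords`, the `keywords` field, and the list-of-dicts input shape are all assumptions made for illustration, and it uses only the Python standard library.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def most_common_keywords(records: Iterable[Dict], top_n: int = 10) -> List[Tuple[str, int]]:
    """Count keywords across all records and return the top_n most frequent."""
    counts: Counter = Counter()
    for record in records:
        # Assumption: each record may carry a "keywords" list of strings.
        # Records without that field are skipped rather than treated as errors.
        counts.update(keyword.lower() for keyword in record.get("keywords", []))
    # most_common() returns (keyword, count) pairs sorted by descending count.
    return counts.most_common(top_n)


if __name__ == "__main__":
    sample = [
        {"keywords": ["Spark", "python"]},
        {"keywords": ["python", "etl"]},
        {},  # a record with no keywords
    ]
    print(most_common_keywords(sample, top_n=1))
```

Lowercasing before counting is a design choice in this sketch so that "Python" and "python" are tallied together; the real job may normalise differently.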

cleanjob():
- Remove the _SUCCESS file that gets generated.
- Add more cleaning regexes.
- Add a function that replaces the dataset's word count with a more accurate one.
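A minimal sketch of how the three cleanjob items might fit together, for discussion only. The helper names (`clean_text`, `accurate_word_count`, `remove_success_marker`) and the regex patterns are hypothetical, since the actual cleaning rules are not shown in this thread.

```python
import re
from pathlib import Path

# Illustrative patterns only; the regexes used by the real cleanjob are not shown here.
_TAG_RE = re.compile(r"<[^>]+>")          # strips HTML-like tags
_WHITESPACE_RE = re.compile(r"\s+")       # collapses runs of whitespace
_WORD_RE = re.compile(r"[A-Za-z0-9]+(?:['-][A-Za-z0-9]+)*")  # words, allowing inner ' or -


def clean_text(text: str) -> str:
    """Remove tags and normalise whitespace."""
    without_tags = _TAG_RE.sub(" ", text)
    return _WHITESPACE_RE.sub(" ", without_tags).strip()


def accurate_word_count(text: str) -> int:
    """Count words in the cleaned text instead of trusting a stored count."""
    return len(_WORD_RE.findall(clean_text(text)))


def remove_success_marker(output_dir: str) -> bool:
    """Delete the _SUCCESS marker from output_dir; return True if one was removed."""
    marker = Path(output_dir) / "_SUCCESS"
    if marker.exists():
        marker.unlink()
        return True
    return False
```

Counting words after cleaning means stray markup and punctuation-only tokens such as `--` do not inflate the total, which is one plausible reading of "a more accurate one".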

joelansbro commented 2 years ago

Requesting changes to the keywordjob. The original had a description, which the new version has left out.