databridgevt / covid19

Data analysis of the 2020 COVID-19 pandemic
MIT License

Somya's Plan #34

Closed (sjain1701 closed 4 years ago)

sjain1701 commented 4 years ago

By April 24, I plan to have a chart that shows how frequently specific risk factors are mentioned in the comm_use_subset. I plan to do this in Python. The risk factors I am looking at are pulmonary disease, co-infection, pregnancy, diabetes, and heart disease.
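A minimal sketch of the counting step, assuming each paper in comm_use_subset has already been loaded into a plain text string (the loading code is not shown here). The function name `count_documents_mentioning`, the whole-phrase, case-insensitive matching rule, and the sample texts are all hypothetical and only for illustration; this is not necessarily how the final chart was produced.

```python
import re
from collections import Counter

# Risk factors listed in the plan above.
RISK_FACTORS = [
    "pulmonary disease",
    "co-infection",
    "pregnancy",
    "diabetes",
    "heart disease",
]


def count_documents_mentioning(texts, risk_factors=RISK_FACTORS):
    """Count how many texts mention each risk factor at least once.

    Assumption: a "mention" is a case-insensitive match of the whole
    phrase, so "pregnant" does not count as "pregnancy".
    """
    patterns = {
        term: re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for term in risk_factors
    }
    # Start every term at zero so unmentioned factors still appear.
    counts = Counter({term: 0 for term in risk_factors})
    for text in texts:
        for term, pattern in patterns.items():
            if pattern.search(text):
                counts[term] += 1
    return counts


# Hypothetical sample texts standing in for paper bodies.
sample_texts = [
    "Diabetes and heart disease were common comorbidities.",
    "Bacterial co-infection was observed in patients with diabetes.",
    "No underlying conditions were reported.",
]

counts = count_documents_mentioning(sample_texts)
for term in RISK_FACTORS:
    print(f"{term}: {counts[term]}")
```

The resulting counts could then be passed to any plotting library as a bar chart. Counting documents rather than raw occurrences is one possible design choice; it keeps a single long paper from dominating the chart.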

chendaniely commented 4 years ago

I've only really used NLTK in the past for these types of tasks, but spaCy is one of the newer Python NLP libraries.

Let me know if you need help with anything. You can always push your code to a branch, and I can try running it on my end.

sjain1701 commented 4 years ago

[Image: riskFactors]

chendaniely commented 4 years ago

Fixed in #41