cu-mkp / sandbox

The “Sandbox” space makes available a number of resources that utilize and explore the data underlying "Secrets of Craft and Nature in Renaissance France. A Digital Critical Edition and English Translation of BnF Ms. Fr. 640" created by the Making and Knowing Project at Columbia University.
https://cu-mkp.github.io/sandbox/

normalized visualization of tags by category frequency #66

Open njr2128 opened 2 years ago

njr2128 commented 2 years ago

Visualizations like the number of deletions and additions by category are really useful, but it would be really interesting to normalize the data to see whether these tags are over- or underrepresented in a given category. Obviously, there are many adds/dels in casting, but that is because there are more casting entries than entries in any other category. If we were to normalize this, would any categories turn out to have proportionally more dels/adds than others?
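A minimal sketch of what that normalization could look like: divide each category's raw tag count by the number of entries in that category, so large categories like casting no longer dominate just by size. The numbers and the names `del_counts`, `entry_counts`, and `tags_per_entry` are made up for illustration and are not taken from the edition's data or from the existing report code.

```python
# Hypothetical toy numbers for illustration only, NOT the edition's real counts.
# del_counts[category]   = total number of deletion tags in that category
# entry_counts[category] = number of entries assigned to that category
del_counts = {"casting": 300, "painting": 120, "medicine": 45}
entry_counts = {"casting": 150, "painting": 40, "medicine": 30}


def tags_per_entry(tag_counts, entry_counts):
    """Return the raw tag count divided by the number of entries, per category."""
    return {
        cat: tag_counts[cat] / entry_counts[cat]
        for cat in tag_counts
        if entry_counts.get(cat, 0) > 0  # skip categories with no entries
    }


rates = tags_per_entry(del_counts, entry_counts)

# Print categories from highest to lowest normalized rate.
for cat, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{cat}: {rate:.2f} deletions per entry")
```

In this toy data casting has the most deletions in absolute terms, but painting has the highest rate per entry. Dividing by word count per category instead of entry count would be another reasonable choice if entries vary a lot in length.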

Can we do this for many of the charts in https://cu-mkp.github.io/sandbox/docs/Kaufman_final-report.html?

njr2128 commented 2 years ago

Even just doing percentages might help

njr2128 commented 2 years ago

Another way would be to calculate the mean rate of the tag of interest across ALL categories and then look at how far each category deviates from that mean.
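Roughly, this could be sketched as follows, assuming "mean" is read as the pooled rate (all tags divided by all entries) and the deviation as each category's per-entry rate minus that pooled rate. The counts and the function name `deviation_from_overall` are hypothetical, for illustration only.

```python
# Hypothetical toy numbers for illustration only, NOT the edition's real counts.
add_counts = {"casting": 200, "painting": 60, "medicine": 40}
entry_counts = {"casting": 100, "painting": 20, "medicine": 80}


def deviation_from_overall(tag_counts, entry_counts):
    """Return (overall_rate, {category: category_rate - overall_rate}).

    overall_rate is the pooled rate: all tags divided by all entries.
    """
    total_tags = sum(tag_counts.values())
    total_entries = sum(entry_counts[cat] for cat in tag_counts)
    overall_rate = total_tags / total_entries
    deviations = {
        cat: tag_counts[cat] / entry_counts[cat] - overall_rate
        for cat in tag_counts
    }
    return overall_rate, deviations


overall, deviations = deviation_from_overall(add_counts, entry_counts)
print(f"overall: {overall:.2f} additions per entry")
for cat, dev in deviations.items():
    # Positive means overrepresented relative to the pooled rate, negative underrepresented.
    print(f"{cat}: {dev:+.2f}")
```

An unweighted mean of the per-category rates would be a different baseline; the pooled rate is used here so that small categories do not skew it.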

njr2128 commented 2 years ago

A way to think about or explain this: what is the probability that a certain tag will occur in a chosen category?
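One way to make that concrete, as a sketch only: estimate P(tag | category) as the share of entries in the category that contain the tag at least once. This needs entry-level data rather than totals. The `entries` list and the function `tag_probability` below are hypothetical and do not reflect the edition's actual data format; if an entry can carry more than one category, it would simply appear once under each.

```python
from collections import Counter

# Hypothetical entry-level data for illustration only: (category, tags present in the entry).
entries = [
    ("casting", {"del", "add"}),
    ("casting", {"add"}),
    ("casting", set()),
    ("casting", {"del"}),
    ("painting", {"del"}),
    ("painting", {"del", "add"}),
]


def tag_probability(entries, tag):
    """Estimate P(tag | category): fraction of a category's entries containing the tag."""
    totals = Counter()  # number of entries per category
    hits = Counter()    # number of entries per category that contain the tag
    for category, tags in entries:
        totals[category] += 1
        if tag in tags:
            hits[category] += 1
    return {cat: hits[cat] / totals[cat] for cat in totals}


probs = tag_probability(entries, "del")
for cat, p in probs.items():
    print(f"P(del | {cat}) = {p:.2f}")
```

Note this counts whether a tag occurs in an entry at all, not how many times, so it answers a slightly different question than a per-entry tag rate.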