SuffolkLITLab / form-explorer

A set of tools for exploring the connections between blank and historic court forms.
https://suffolklitlab.org/form-explorer/

create tool to group and order lists of variables #20

Closed: colarusso closed this issue 2 years ago

colarusso commented 2 years ago

Given a list of variable names (a mix of standard and nonstandard), return a list of lists where each inner list corresponds to a screen (maybe this should be a dictionary)...
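To make the two candidate return shapes concrete, here is a small sketch. The variable names and the `screen_N` labels are made up for illustration and are not taken from the repo.

```python
# Hypothetical variable names, for illustration only.
variables = ["users1_name_first", "users1_name_last", "court_name", "docket_number"]

# Shape 1: a list of lists, one inner list per screen, in screen order.
screens_as_lists = [
    ["users1_name_first", "users1_name_last"],
    ["court_name", "docket_number"],
]

# Shape 2: a dictionary keyed by a screen label (the labels are an assumption).
screens_as_dict = {
    f"screen_{i}": fields for i, fields in enumerate(screens_as_lists, start=1)
}
```

Both shapes keep the screens in order (dictionaries preserve insertion order in Python 3.7+); the dictionary mainly adds a place to hang a label on each screen.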

My thinking is that we reCase() the variable names, vectorize them, and then cluster the results. See, e.g., https://machinelearningmastery.com/clustering-algorithms-with-python/
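A rough, standard-library-only sketch of that pipeline might look like the following. Everything here is an assumption made for illustration: `recase_stub` is a hypothetical stand-in for `reCase()` (whose actual behavior is not shown in this thread), the vectors are plain token counts, and the clustering is a simple greedy pass over cosine similarity rather than any of the algorithms in the linked article.

```python
import math
import re
from collections import Counter


def recase_stub(name):
    """Hypothetical stand-in for reCase(): split a name into lowercase word tokens."""
    # Insert a space at camelCase boundaries, then split on runs of non-letters,
    # which also drops digits such as the "1" in "users1".
    spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", name)
    return [t.lower() for t in re.split(r"[^A-Za-z]+", spaced) if t]


def vectorize(tokens):
    """Bag-of-words vector stored as a token -> count mapping."""
    return Counter(tokens)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def cluster_screens(names, threshold=0.4):
    """Greedy clustering: put each name in the first screen that already holds
    a sufficiently similar name; otherwise start a new screen."""
    vectors = {n: vectorize(recase_stub(n)) for n in names}
    screens = []
    for name in names:
        for screen in screens:
            if any(cosine(vectors[name], vectors[m]) >= threshold for m in screen):
                screen.append(name)
                break
        else:
            screens.append([name])
    return screens


# Hypothetical input names, mixing snake_case and camelCase.
names = [
    "users1_name_first",
    "users1_name_last",
    "users1_address_city",
    "docket_number",
    "docketDate",
]
screens = cluster_screens(names)
# screens == [["users1_name_first", "users1_name_last"],
#             ["users1_address_city"],
#             ["docket_number", "docketDate"]]
```

The `threshold` value is arbitrary and the result depends on input order, so this is only meant to show the three steps end to end, not to suggest a particular algorithm.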

colarusso commented 2 years ago

I made a quick-and-dirty tool. Different clustering algos may work better, and it may be that a rules-based approach is best in the end, but you can find it here: https://github.com/SuffolkLITLab/form-explorer/blob/main/101%20Cluster%20Screens.ipynb
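For comparison, a rules-based version could be as simple as the sketch below. The rule itself (group on whatever comes before the first underscore) and the sample names are assumptions for illustration; this is not what the linked notebook does.

```python
from collections import defaultdict


def group_by_prefix(names, sep="_"):
    """Hypothetical rule: names sharing the text before the first separator
    go on the same screen. Returns a dict keyed by that prefix."""
    screens = defaultdict(list)
    for name in names:
        prefix = name.split(sep, 1)[0]
        screens[prefix].append(name)
    return dict(screens)


# Hypothetical variable names, for illustration only.
grouped = group_by_prefix(["users1_name_first", "users1_name_last", "docket_number"])
# grouped == {"users1": ["users1_name_first", "users1_name_last"],
#             "docket": ["docket_number"]}
```

A rule like this is predictable and easy to debug, but it only works when names follow a consistent convention, which is where clustering might still help with the nonstandard ones.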