Closed apmcleod closed 4 years ago
This works on all of the acme degradations. However, using it, I found many errors in those files. I confirmed that those errors are present in the underlying csv, and that this measure_errors.py script outputs sensible results.
Combining the dataset work into amt. This is an attempt at #82 as well.
It's just a Degrader class that you can create from a config.json file or from manual parameters. You then call degrader.degrade(note_df) on any input data point, and it returns a degraded_df and a degradation_id label for you to use as you wish.
I would say that this also fixes #82. Although it's not a Dataset, it does have the "degrade-on-the-fly" functionality.
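For anyone reading along, here is a rough sketch of the usage pattern described above. It is not the code in this PR. The two degradation functions, the `clean_prop` and `seed` parameters, the config keys, and the convention that label 0 means "left clean" are all assumptions made for illustration. To keep the sketch dependency-free, a list of dicts stands in for note_df.

```python
import json
import random


# Hypothetical stand-in degradations; the project's real ones are not shown here.
def pitch_shift(notes, rng):
    """Shift the pitch of one randomly chosen note by 1 or 2 semitones."""
    out = [dict(n) for n in notes]
    i = rng.randrange(len(out))
    out[i]["pitch"] += rng.choice([-2, -1, 1, 2])
    return out


def remove_note(notes, rng):
    """Delete one randomly chosen note."""
    out = [dict(n) for n in notes]
    del out[rng.randrange(len(out))]
    return out


DEGRADATIONS = {"pitch_shift": pitch_shift, "remove_note": remove_note}


class Degrader:
    """Illustrative degrade-on-the-fly helper (assumed interface)."""

    def __init__(self, degradations=None, clean_prop=0.0, seed=None, config=None):
        # Values found in the config file (assumed keys) override the arguments.
        if config is not None:
            with open(config) as f:
                params = json.load(f)
            degradations = params.get("degradations", degradations)
            clean_prop = params.get("clean_prop", clean_prop)
            seed = params.get("seed", seed)
        self.names = list(degradations or DEGRADATIONS)
        self.clean_prop = clean_prop
        self.rng = random.Random(seed)

    def degrade(self, notes):
        """Return (degraded_notes, degradation_id); id 0 means left clean."""
        if self.rng.random() < self.clean_prop:
            return [dict(n) for n in notes], 0
        idx = self.rng.randrange(len(self.names))
        return DEGRADATIONS[self.names[idx]](notes, self.rng), idx + 1


# Usage: the input is never modified, so it can be reused across epochs.
notes = [{"onset": 0, "pitch": 60, "dur": 100},
         {"onset": 100, "pitch": 62, "dur": 100}]
degrader = Degrader(degradations=["remove_note"], seed=0)
degraded, label = degrader.degrade(notes)
```

Returning a fresh copy rather than mutating the input is what makes the on-the-fly use safe: the same clean data point can be degraded differently each time it is drawn.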
This mainly implements the measure_errors.py script, which generates a JSON of degradation and clean proportions given a gt and a trans directory.
This fixes #30.
On a minor note, this also fixes #110.