Open jonahpearl opened 1 month ago
@chrishalcrow suggested at our last meeting that we should fix this. We're still in the planning phase of figuring it out. Thanks for the report!
I also want to link this discussion, which makes a similar request, so that if there is any debate we know this is a frequently requested feature.
Just adding my wish for this to come true, too!
Yes, I'm keen for this too!
Hi all, I recently decided to add the `nearest_neighbors` quality metric to my pipeline, but when I tried to compute just that metric, I was annoyed to find that it overwrote all the other previously calculated metrics. This seems like non-optimal behavior: if a user computes slow quality metric X and wants to add slow quality metric Y the next day, they have to re-compute X as well, or otherwise manually futz with the saved CSVs.

This behavior seems to be implemented here. Instead of creating a new df each time, why not check for an existing one and merge it with any newly computed metrics? I understand that overwriting is perhaps a better default, in case the user has curated the units or otherwise changed pre-processing and is trying to compute metrics de novo, but there could be a "keep_existing" kwarg or something that lets the user choose not to overwrite what's already there.
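To make the idea concrete, here is a rough sketch of the merge in plain pandas. It is not the library's actual code: `merge_metrics` is a hypothetical helper, `keep_existing` is the kwarg name suggested above, and the column names are only illustrative. It assumes both tables are indexed by unit id.

```python
import pandas as pd


def merge_metrics(existing, new, keep_existing=True):
    """Combine previously saved metrics with newly computed ones.

    Hypothetical helper: both frames are assumed to be indexed by unit id.
    """
    # Nothing saved yet, or the user asked to overwrite: return only the new table.
    if existing is None or not keep_existing:
        return new.copy()

    # Drop old columns that were just recomputed so the new values win.
    recomputed = existing.columns.intersection(new.columns)
    kept = existing.drop(columns=recomputed)

    # Outer join on the index keeps units that appear in either table;
    # metrics missing for a unit show up as NaN.
    return kept.join(new, how="outer")


# Illustrative data only: "yesterday's" metrics and "today's" new ones.
existing = pd.DataFrame({"snr": [5.0, 7.5], "isi_violation": [0.1, 0.0]}, index=[0, 1])
new = pd.DataFrame({"nn_hit_rate": [0.9, 0.8], "snr": [6.0, 7.0]}, index=[0, 1])

merged = merge_metrics(existing, new, keep_existing=True)
overwritten = merge_metrics(existing, new, keep_existing=False)
```

With `keep_existing=True`, the old `isi_violation` column survives, `snr` takes the recomputed values, and the new column is added. With `keep_existing=False`, only the newly computed columns remain, which matches the current overwrite behavior. If units were curated between runs, the outer join would leave NaNs for units missing from one table, which is one reason overwriting may still be the safer default.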
Thanks!