manzt / quak

a scalable data profiler
https://manzt.github.io/quak/
MIT License
197 stars 9 forks

Support dates histogram #35

Open svittoz opened 1 month ago

svittoz commented 1 month ago

Thanks for this nice library!

In the example you provide (athletes.csv), the histogram for the date_of_birth column is not displayed.

Is this feature on your roadmap?

manzt commented 1 month ago

Hi, definitely a feature that would be great to have! I welcome a PR!

manzt commented 3 weeks ago

Hey! I looked into this today, and I think it's actually an upstream issue in mosaic: https://github.com/uwdata/mosaic/issues/484. I'd really like to land the fixes there, since I'd like to avoid vendoring more code.

Right now, I think the histogram will work for some date columns, depending on mosaic's auto-binning. I don't know the auto-binning logic well enough to say which distributions of datetimes will work, but hopefully we will have more robust support sometime soon.