Open victordibia opened 1 year ago
Is the goal here to allow users to upload their own datasets or to offer a platform for data analysis from a bank of "pre"-provided ready datasets?
Thanks Aiden. I am leaning more towards supporting discovery of data as opposed to hosting data (we probably can assume the user is able to do this already).
I updated the initial description to add more information.
Cool! This is what we’re doing at wobby.ai. We ingest tons of public data, enrich it with AI, and let you analyze it.
Right now we're working with journalists, making it easy for them to find data stories in public data.
Would be cool to see this in LIDA. Love this project :)
What
Data analysis and exploration typically begin with the assumption that the right dataset exists. For many scenarios, this assumption holds (e.g., organizational data already exists as a tidy CSV or JSON file). However, for other use cases, the right dataset may not exist and needs to be found.
The high level goal of this functionality is
How
Supported approaches may include the following:
Possibly start off with a base `DataFinder` class (with a `find` method), a `HeuristicsDataFinder` subclass, and an `AgentDataFinder` subclass.
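
To make the proposed class layout easier to discuss, here is a minimal sketch of how the three classes could fit together. Only the names `DataFinder`, `find`, `HeuristicsDataFinder`, and `AgentDataFinder` come from this issue. Everything else is an assumption made for illustration: the `DatasetCandidate` record, the `query` and `k` parameters, the keyword-overlap scoring, and the idea of passing the agent in as a plain callable. None of this is existing LIDA API.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class DatasetCandidate:
    """Hypothetical record for a dataset that might match a query."""
    name: str
    description: str
    url: str
    score: float = 0.0


class DataFinder(ABC):
    """Base class: every finder exposes the same find() method."""

    @abstractmethod
    def find(self, query: str, k: int = 5) -> List[DatasetCandidate]:
        """Return up to k candidate datasets for the query."""


class HeuristicsDataFinder(DataFinder):
    """Ranks a fixed catalog by keyword overlap (an assumed heuristic)."""

    def __init__(self, catalog: Iterable[DatasetCandidate]):
        self.catalog = list(catalog)

    def find(self, query: str, k: int = 5) -> List[DatasetCandidate]:
        terms = set(query.lower().split())
        results = []
        for item in self.catalog:
            words = set(f"{item.name} {item.description}".lower().split())
            overlap = len(terms & words)
            if overlap:
                # Score is the fraction of query terms found in the entry.
                results.append(
                    DatasetCandidate(
                        item.name, item.description, item.url,
                        overlap / len(terms),
                    )
                )
        results.sort(key=lambda c: c.score, reverse=True)
        return results[:k]


class AgentDataFinder(DataFinder):
    """Delegates the search to a caller-supplied callable.

    The callable is a stand-in for an LLM-driven agent; how that agent
    searches is left open here.
    """

    def __init__(self, agent: Callable[[str], List[DatasetCandidate]]):
        self.agent = agent

    def find(self, query: str, k: int = 5) -> List[DatasetCandidate]:
        return self.agent(query)[:k]
```

With this shape, calling code depends only on `DataFinder.find`, so a heuristic finder and an agent-backed finder can be swapped without other changes.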
p.s. if you are interested in working on this, please share thoughts on your general approach for discussion and comment.