probcomp / BayesDB

A Bayesian database table for querying the probable implications of data as easily as SQL databases query the data itself. New implementation in http://github.com/probcomp/bayeslite
http://probcomp.csail.mit.edu/software/bayesdb/
Apache License 2.0
888 stars, 52 forks

Memory leak when running CREATE MODEL... on table with ~ 30 columns #12

huroh closed this issue 8 years ago

huroh commented 10 years ago

[Using the provided VM]

CREATE BTABLE training FROM data.txt;
CREATE 10 MODELS FOR training;

https://dl.dropboxusercontent.com/u/68514/bayesdb/data.txt

huroh commented 10 years ago

If I cut the table down to about 7 columns, it works.

huroh commented 10 years ago

I've noticed that I also get memory leaks when running CREATE MODEL if I use 'NA' instead of 'nan' to indicate missing values in a column that is otherwise numerical. Let me know if you need an example of this.
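
For anyone hitting the same thing, here is a minimal sketch of rewriting 'NA' to 'nan' before loading the file. It assumes the data file is comma-separated with a header row, which may not match every dataset. The helper name `normalize_missing` is hypothetical and is not part of BayesDB. Whether this avoids the leak in every case has not been verified.

```python
import csv
import io


def normalize_missing(src, dst, missing_tokens=("NA",), replacement="nan"):
    """Copy CSV rows from src to dst, replacing missing-value tokens.

    Hypothetical helper (not a BayesDB function). Assumes comma-separated
    input whose first row is a header, which is copied through unchanged.
    Returns the number of cells that were replaced.
    """
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")

    header = next(reader, None)
    if header is None:  # empty input: nothing to copy
        return 0
    writer.writerow(header)

    replaced = 0
    for row in reader:
        cleaned = []
        for cell in row:
            # Compare the stripped cell so " NA " is also caught.
            if cell.strip() in missing_tokens:
                cleaned.append(replacement)
                replaced += 1
            else:
                cleaned.append(cell)
        writer.writerow(cleaned)
    return replaced


# Small in-memory demonstration with made-up values.
sample = io.StringIO("id,score,hours\n1,0.5,NA\n2,NA,3\n")
out = io.StringIO()
count = normalize_missing(sample, out)
```

To run it on real files, open both with `open(path, newline="")` and pass the file objects in place of the `StringIO` buffers.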

jbaxter commented 10 years ago

Again, thanks for helping us find all these bugs. I am not sure we explicitly support 'NA' to indicate missing values right now, but we should definitely add support for that.

Also, I am curious: what data are you trying to analyze? Is it some type of psychology experiment with a variety of tasks, recording how well each subject did on each task?

huroh commented 10 years ago

Not psychology; it's educational analytics. The aim is early prediction of student success. I'm interested in comparing the methods in BayesDB with various machine learning models I've already been using.