misoproject / dataset

JavaScript library that makes managing the data behind client-side visualisations easy
http://misoproject.com
GNU General Public License v2.0

Type mismatch problem on Dataset #212

Open dani-lo opened 11 years ago

dani-lo commented 11 years ago

Hi there

I am trying to introduce Dataset into an existing MVC front end, primarily for its data-parsing features (column operations in particular), with a view to switching the whole data layer over to Dataset (it currently relies on Backbone/d3).

My data comes in as a very large CSV, which I then feed to a custom-built d3 charting library. All values arrive as strings (e.g. {"foo" : "23", "bar" : "0.34"}). After parsing the CSV to JSON I build a Miso.Dataset. It immediately complains about the data type of one particular column, so in the constructor I specify that column's type as string. The Dataset then builds successfully and, for example, I can call columns() on it and log the result to the console.

My problem is that further operations on the resulting Dataset, e.g. operations that modify it or return a subset (addComputedColumn, groupBy), keep complaining about the same data type problem on that column, even though I specified it as a string in the options hash. The message is "Uncaught incorrect value '' of type string passed to column 'xxxxx' with type number". I reckon it is raised because a column may hold a numeric value (such as "98") in some rows but be empty ("") further down. Although this applies to most columns, Dataset always complains about one particular column (which I am calling xxxxx).

Does anybody have an idea how I can overcome this issue? At the moment I am trying to add a computed column just for testing, and I get this error even though the function passed to addComputedColumn does not reference the column Miso complains about.

thanks a lot, any hint appreciated ..

Daniele

ps I know using strings for numeric values is not exactly best practice ..

dani-lo commented 11 years ago

To illustrate the issue I described above, I have reproduced the behaviour in a small snippet.

The code below throws the error "Uncaught incorrect value '' of type string passed to column 'foo' with type number". I am a bit confused, as I specify the foo column to be a string, so I would not expect it to fail. I know it would be ideal to pass all numeric values as numbers and replace empty values with 0, but at the moment this is not possible. The code seems to break on fetch, so it never reaches the success callback. Any help appreciated, thanks.

    var tdata = [{ "foo": "34" }, { "foo": "5" }, { "foo": "" }],
        ds = new Miso.Dataset({
            data: tdata,
            columns: { name: 'foo', type: 'string' }
        });

    ds.fetch({
        success: function() {
            // Never reached: the error is thrown during fetch.
            this.addComputedColumn("testcol", "string", function(row) {
                return row.foo;
            });
        }
    });

protobi commented 11 years ago

Ran into the same issue (https://github.com/misoproject/dataset/issues/125). Here's a quick patch that always scans all the values in a column (not just the first five) to determine the column type: https://github.com/gradualstudent/dataset/tree/master/dist
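
To make the difference concrete, here is a minimal, library-independent sketch of inferring a column type from a leading sample versus a full scan. It is not the code of the patch or of Dataset itself; `typeOfValue`, `inferColumnType`, the regular expression, and the sample data are all hypothetical and only illustrate why an empty string further down a column can be missed.

```javascript
// Hypothetical helpers for illustration only; not Miso Dataset's implementation.
const NUMERIC = /^\s*-?\d+(\.\d+)?\s*$/;

// Classify a single raw value as "number" or "string".
function typeOfValue(value) {
  if (typeof value === "number") return "number";
  if (typeof value === "string" && NUMERIC.test(value)) return "number";
  return "string";
}

// Infer a column type from at most `sampleSize` leading values.
// Passing Infinity scans the whole column.
function inferColumnType(values, sampleSize) {
  const sample = values.slice(0, sampleSize);
  return sample.every((v) => typeOfValue(v) === "number") ? "number" : "string";
}

// Made-up column: numeric strings first, an empty string in sixth position.
const column = ["98", "34", "5", "12", "7", "", "41"];

console.log(inferColumnType(column, 5));        // "number": the sample never sees ""
console.log(inferColumnType(column, Infinity)); // "string": the full scan finds ""
```

The trade-off is cost: a full scan touches every value of every column, which may matter for very large inputs.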

dani-lo commented 11 years ago

OK, thanks for that, I will see if that patch fixes the problem for me. Is there any plan to apply this to the library in future releases? I'm sure type checking makes sense for data operations and reliability, but in a JSON client/server kind of setup type consistency is not a given in my experience (though I admit that is not how things would look in an ideal world).
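
For anyone hitting the same error where changing the input is an option, one possible caller-side workaround is to normalise the rows before building the Dataset. The sketch below is plain JavaScript and makes two assumptions: that an empty string should become null rather than 0, and that the library version in use accepts null in a numeric column (worth verifying). `normaliseValue` and `normaliseRow` are hypothetical helpers, not part of Dataset.

```javascript
// Hypothetical helpers for illustration only; not part of Miso Dataset.
const NUMERIC_STRING = /^\s*-?\d+(\.\d+)?\s*$/;

// Convert one raw value: "" becomes null, numeric strings become numbers.
function normaliseValue(value) {
  if (value === "") return null; // assumption: empty means missing
  if (typeof value === "string" && NUMERIC_STRING.test(value)) {
    return Number(value);
  }
  return value; // leave anything else untouched
}

// Return a copy of the row with every field normalised.
function normaliseRow(row) {
  const out = {};
  for (const key of Object.keys(row)) {
    out[key] = normaliseValue(row[key]);
  }
  return out;
}

const rows = [{ foo: "34" }, { foo: "5" }, { foo: "" }].map(normaliseRow);
console.log(rows); // foo is 34, 5 and null respectively
```

With the values typed consistently up front, the column no longer mixes numeric strings and empty strings, whichever way the type is detected.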