Gmousse / dataframe-js

No Maintenance Intended
https://gmousse.gitbooks.io/dataframe-js/
MIT License

[BUG] Dataframe loaded from file doesn't identify missing values #125

Open itstillon opened 3 years ago

itstillon commented 3 years ago

**Describe the bug**
When a DataFrame is loaded from a local file, every value is read as a string, so an empty field becomes an empty string instead of being tracked as a missing value. As a result, `fillMissingValues()` doesn't work as expected.

However, this works when the DataFrame is prepared on the fly (using `new DataFrame(...)`).
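The difference between the two cases can be sketched without the library. The `isMissing` helper below is hypothetical and is not dataframe-js code; it only assumes that "missing" means `undefined`, `null`, or `NaN`, so an empty string read from a CSV field would not count.

```javascript
// Hypothetical helper, not part of dataframe-js: assumes "missing" means
// undefined, null, or NaN, and nothing else.
function isMissing(value) {
  return value === undefined || value === null || Number.isNaN(value);
}

// Splitting the CSV line "Amy," leaves an empty string in the age position.
const fromCsv = 'Amy,'.split(',');
// The inline object from the issue has no age key at all.
const fromObject = { name: 'Amy' };

const csvAgeMissing = isMissing(fromCsv[1]);
const objectAgeMissing = isMissing(fromObject.age);
console.log({ csvAgeMissing, objectAgeMissing });
```

Under that assumption, only the inline object would be seen as having a missing `age`, which would match the behavior described here.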

**To Reproduce**
Steps to reproduce the behavior:

  1. Copy the content below to a file and save it (e.g. `test.csv`):

    name,age
    Adam,10
    Amy,

  2. Run the code below (assuming the `dataframe-js` dependency is installed):

    const DataFrame = require('dataframe-js').DataFrame;

    (async () => {
      const df = await DataFrame.fromCSV('test.csv');
      df.fillMissingValues(0).show();
    })();

**Expected behavior**
The second row should have been updated with `0` as the age.

**Screenshots**
<img src="https://user-images.githubusercontent.com/18084419/115605739-81ec3300-a300-11eb-9057-a26c96c2ed87.png" width="250" height="150">

**Desktop (please complete the following information):**
 - OS: Ubuntu

**Additional context**
This works as expected when the data is constructed directly in code, i.e.

_Code:_

    const DataFrame = require('dataframe-js').DataFrame;

    (async () => {
      const df = await new DataFrame([
        { name: 'Adam', age: 10 },
        { name: 'Amy' }
      ]);
      df.fillMissingValues(0).show();
    })();

_Output:_
<img src="https://user-images.githubusercontent.com/18084419/115606217-0d65c400-a301-11eb-9263-2a2958b179e9.png" width="250" height="150">
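A possible workaround, sketched under assumptions: read the CSV text yourself, turn empty fields into `undefined`, and build the rows as plain objects like the inline example above. `parseCsvRows` is a hypothetical helper, not a dataframe-js function, and the parsing is deliberately naive (no quoted fields or escaped commas).

```javascript
// Hypothetical helper for illustration only; not part of dataframe-js.
// Naive CSV parsing: no support for quoted fields or escaped commas.
function parseCsvRows(text) {
  const [headerLine, ...lines] = text.trim().split(/\r?\n/);
  const headers = headerLine.split(',');
  return lines.map((line) => {
    const cells = line.split(',');
    const row = {};
    headers.forEach((header, i) => {
      const cell = cells[i];
      // Store an empty or absent field as undefined rather than ''.
      row[header] = cell === undefined || cell === '' ? undefined : cell;
    });
    return row;
  });
}

const rows = parseCsvRows('name,age\nAdam,10\nAmy,\n');
console.log(rows);
// Assumption: these rows could then be passed to `new DataFrame(rows)`,
// as in the inline example, before calling fillMissingValues(0).
```

Note that non-empty values such as `'10'` remain strings in this sketch; converting them to numbers would be a separate step.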