Gmousse / dataframe-js

No Maintenance Intended
https://gmousse.gitbooks.io/dataframe-js/
MIT License
460 stars, 38 forks

need `DataFrame.dropna` function to remove missing rows #74

Closed: geyang closed this issue 5 years ago

geyang commented 5 years ago

It would be great if there were a performant way to do this.

The API documentation is comprehensive. drop and dropDuplicates already exist, so dropna would probably be a good addition.

Right now I'm achieving this functionality with filter:

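const df = new DataFrame({
  y: series.yMean,
  x: series.xData ? series.xData : series.yMean.map((_, i) => i)
});
// NaN is the only value that is not equal to itself,
// so this keeps only the rows whose y is not NaN.
return df
  .filter(row => row.get('y') === row.get('y'))
  .toCollection();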

Given that this is a pure JavaScript library, this is probably about as fast as it can get. On the other hand, the filter gets more verbose when several columns have to be checked, and that is where a dropna function on the DataFrame would make sense.
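To illustrate the multi-column case, here is a minimal sketch of a reusable predicate. hasNoMissing and fakeRow are hypothetical names and not part of dataframe-js. The sketch only assumes that a row exposes its values through get(name), as in the filter call above, and it treats NaN, null, and undefined as missing.

```javascript
// Hypothetical helper (not part of dataframe-js): builds a predicate that
// returns true only when none of the given columns is NaN, null, or undefined.
function hasNoMissing(columns) {
  const isMissing = (v) => v === null || v === undefined || Number.isNaN(v);
  return (row) => columns.every((name) => !isMissing(row.get(name)));
}

// Stand-in for a DataFrame row so the sketch runs without the library;
// it only mimics the get(name) accessor used in the filter above.
const fakeRow = (obj) => ({ get: (name) => obj[name] });

const rows = [
  fakeRow({ x: 0, y: 1.5 }),    // complete row, kept
  fakeRow({ x: 1, y: NaN }),    // y is NaN, dropped
  fakeRow({ x: null, y: 2.0 }), // x is null, dropped
];

const kept = rows.filter(hasNoMissing(['x', 'y']));
console.log(kept.length);
```

With a real DataFrame, the same predicate could presumably be passed straight to filter, for example df.filter(hasNoMissing(['x', 'y'])).toCollection().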

Gmousse commented 5 years ago

Hi @episodeyang,

Good idea. I will try it and post a proposal here.

Gmousse commented 5 years ago

Hi @episodeyang, it's done: we have two new methods, .dropMissingValues and .fillMissingValues.

It's on the develop branch and will be released in 1.4.0 soon.

geyang commented 5 years ago

Thanks, this is really awesome!

Gmousse commented 5 years ago

Released in 1.4.0