go-gota / gota

Gota: DataFrames and data wrangling in Go (Golang)
Other
3.04k stars 281 forks source link

Combining filters with AND and user-defined filters #99

Closed chrstphlbr closed 4 years ago

chrstphlbr commented 5 years ago

This PR refines filtering of DataFrames: (1) support for combining filters with AND without having to chain the filters, which should be more performant as rows only need to be traversed once (df.FilterAggregation) (2) support for user-defined filters (series.CompFunc and func(el series.Element) bool) (3) test cases for both new features (4) updated README reflecting the additions

It should be backwards compatible as df.Filter retains OR semantics.

Cheers Christoph

kniren commented 4 years ago

Hello @chrstphlbr, sorry that it took me this long to review this issue. Looks like you put a lot of good work into this PR and I will be happy to merge this. If you can fix the merge conflicts currently present I will get this in ASAP.

chrstphlbr commented 4 years ago

Hi @kniren, I will make some time towards the end of the week to address the merge conflicts.

chrstphlbr commented 4 years ago

Hi @kniren, well, I pushed a non-conflicting version just now :)

kniren commented 4 years ago

@chrstphlbr If you think there is no problems with including modules at this point, I would retract my comment. I'm just not familiar with modules in go yet, so I wanted to make sure it was fine to proceed like this.

chrstphlbr commented 4 years ago

Quite the opposite: I think Go modules are (almost) a necessity to go forwards. From my personal experience with using gota (with modules) from other projects, everything works smoothly. Modules should be finalized in the upcoming Go version (1.14) BUT will also support go.mod files from versions 1.11, 1.12, and 1.13 [1].

[1] https://github.com/golang/go/wiki/Modules

kniren commented 4 years ago

I understand, but I'm not sure how it actually works and need to do a bit more reading. For example, when looking at this PR vs #103 I see different go.mod and go.sum files. Which one is correct? Why is gonum.org/v1/plot or github.com/golang/freetype in there if it is not being used by this project? Is this due to different users being developing locally?

I would like to avoid that individual user preferences leak into the repo. Furthermore, I would also like to make sure that the SemVer I'm currently using works automatically with the modules or if I need further action.

As I said, if you think there are no other problems, I will get this in right away.

chrstphlbr commented 4 years ago

Also, #97, which introduced go.mod and go.sum, was already merged into dev. Bottom line, I think we are good to go with modules :)

kniren commented 4 years ago

We can keep the discussion going with regards to modules, but I guess it makes no sense to keep dwelling on it for now.

If something breaks, we will fix it on a different PR.

I will try to learn a bit more about modules before the next release.

Thanks once again for your work and patience!

;)

chrstphlbr commented 4 years ago

No worries. Happy to have contributed something meaningful.

Maybe running go mod tidy before every push/commit/PR should be best practice. This removes all unnecessary dependencies in go.mod and updates go.sum (unfortunately I also forget about that in this PR). But I am also not sure what best practice is.

kniren commented 4 years ago

Yeah, I just set up TravisCI with a simple go test in it. I will add some checks for style and consistency to the build file to avoid issues in the future.