h2oai / db-benchmark

reproducible benchmark of database-like ops
https://h2oai.github.io/db-benchmark
Mozilla Public License 2.0
323 stars 85 forks

Add dtplyr (data.table with dplyr syntax) #76

Closed mrchypark closed 5 years ago

mrchypark commented 5 years ago

How about dtplyr?

The dtplyr package provides functions that let you use dplyr syntax with data.table objects.

https://github.com/hadley/dtplyr

jangorecki commented 5 years ago

Thanks for filing this suggestion. The only problem with dtplyr right now is that it is not actively maintained. There were a few commits in February, but not really what is required there. As of now it makes sense to run it once to see how it performs, but it doesn't make sense to have it in the benchmark scheduler.

mrchypark commented 5 years ago

Makes sense. Thank you for considering this option.

jangorecki commented 5 years ago

It would be best to get back to this when dtplyr is maintained again.

jangorecki commented 4 years ago

dtplyr is actively maintained again, although I don't think we should be adding it to the benchmark. We could assume it will be about as fast as data.table. We don't want to benchmark a translation layer (which dtplyr is) but algorithm implementations. Otherwise it would also make sense to add sparklyr, Spark from Scala, etc.
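
For readers unfamiliar with the "translation layer" point, here is a minimal sketch of what it means in practice. It assumes dtplyr 1.0.0 or later (the `lazy_dt()` interface) with data.table and dplyr installed. The toy table and its column names `id` and `v` are made up for illustration and are not one of the benchmark's datasets.

```r
library(data.table)
library(dtplyr)
library(dplyr, warn.conflicts = FALSE)

# Hypothetical toy data, not a benchmark dataset.
dt <- data.table(id = c("a", "b", "a", "b"), v = c(1, 2, 3, 4))

# dplyr verbs applied to a lazy_dt are recorded, not executed.
query <- lazy_dt(dt) %>%
  group_by(id) %>%
  summarise(total = sum(v))

# Print the data.table call that dtplyr generated from the dplyr verbs.
show_query(query)

# Run the generated call; data.table does the actual grouping and summing.
via_dtplyr <- as.data.table(query)

# The same aggregation written directly in data.table.
direct <- dt[, .(total = sum(v)), keyby = id]
```

Under these assumptions, timing the two forms would mostly measure the same data.table grouping code, plus whatever overhead the translation step adds, which is the reason given above for leaving dtplyr out.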