lhenneman / hyspdisp


data.table vs tidyverse use. #19

Closed. schoolAccountMajaG closed this issue 5 years ago

schoolAccountMajaG commented 5 years ago

I can see that the code uses data.table heavily. We could consider translating some parts of the code and the vignette to the tidyverse. That would be more user-friendly but slower. Maybe the functions should be written in data.table and the vignette in tidyverse?

https://www.waldrn.com/dplyr-vs-data-table/

schoolAccountMajaG commented 5 years ago

For example, at https://github.com/lhenneman/hyspdisp/blob/master/vignettes/hyads.Rmd#L44-L46 data.table is 4 times quicker than dplyr/tidyverse. How do we prioritize: user experience or speed? (Assuming an average user is more familiar with dplyr.)

cchoirat commented 5 years ago

Some dplyr operations can be run directly on a data.table object.

Also, there are packages that may help: https://github.com/gdemin/maditr if you decide to stick to data.table.
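As a rough illustration of the first point, here is a minimal sketch of dplyr verbs applied to a data.table. The table and its column names (`group`, `value`) are made up for the example and are not taken from hyspdisp. This works because a data.table is also a data.frame, so dplyr falls back to its data.frame methods; the class of the result may differ from the input, so the sketch does not rely on it.

```r
library(data.table)
library(dplyr)

# Hypothetical toy data, not from the package
dt <- data.table(group = c("a", "a", "b"), value = c(1, 3, 5))

# dplyr verbs dispatch on the data.frame class that data.table inherits
out <- dt %>%
  filter(value > 1) %>%
  group_by(group) %>%
  summarise(mean_value = mean(value))

print(out)
```

Note that going through dplyr's data.frame methods probably gives up some of data.table's speed advantage, so this is more about readability than performance.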


lhenneman commented 5 years ago

I think we should prioritize speed within the functions, since there's a lot of reading and writing that happens throughout. Re: changing the vignette, I'm happy either way. We'd have to make sure that the functions played nicely with data.frames as inputs (they may already, but we should check).
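One way to check the data.frame point is sketched below, under the assumption that a function converts its input with `as.data.table()` before doing any data.table work. The function name `summarise_by_group` and the columns `group` and `value` are hypothetical and do not come from hyspdisp.

```r
library(data.table)

# Hypothetical helper: accepts a data.frame or a data.table
summarise_by_group <- function(x) {
  # as.data.table() returns a copy, so the caller's object is not modified
  dt <- as.data.table(x)
  dt[, list(mean_value = mean(value)), by = group]
}

# A plain data.frame as input
df <- data.frame(group = c("a", "a", "b"), value = c(1, 3, 5))
res <- summarise_by_group(df)

print(res)
```

`setDT()` would avoid the copy by converting in place, but it changes the caller's object, which may surprise users who pass in a data.frame.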

cchoirat commented 5 years ago

I'm a huge fan of data.table ;)


schoolAccountMajaG commented 5 years ago

OK, I think we should stick with data.table then!

cchoirat commented 5 years ago

I think it's possible to use pipes (%>%) and similar tools with data.table.
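A minimal sketch of what that could look like, assuming the magrittr pipe and a made-up table (the columns `group` and `value` are hypothetical, not from hyspdisp). The magrittr placeholder `.` stands for the piped data.table, so the usual `[i, j, by]` syntax stays intact.

```r
library(data.table)
library(magrittr)

# Hypothetical toy data, not from the package
dt <- data.table(group = c("a", "a", "b"), value = c(1, 3, 5))

# Each step receives the previous data.table as `.`
# list() is used in j instead of .() to avoid confusion with the pipe placeholder
out <- dt %>%
  .[value > 1] %>%
  .[, list(mean_value = mean(value)), by = group]

print(out)
```

The same result can be written without pipes by chaining brackets, `dt[value > 1][, list(mean_value = mean(value)), by = group]`, so the pipe is purely a readability choice here.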


schoolAccountMajaG commented 5 years ago

Decided to use data.table.