rdatatable-community / The-Raft

A blog for the data.table community.
https://rdatatable-community.github.io/The-Raft/

blog about C API #55

Open tdhock opened 2 weeks ago

tdhock commented 2 weeks ago

@aitap, since you are familiar with the C API issue https://github.com/Rdatatable/data.table/issues/6180, I was wondering whether you would be interested in writing a blog post for The Raft about the current state of the R C API and how data.table uses it. If so, could you please write a general, gentle introduction, so that even people who are not C API experts (like me) can understand it? Target readers would be any current or future data.table developers, as well as other R package developers who use the C API.

aitap commented 2 weeks ago

Thank you for the opportunity! I will do my best. We might need a few editing rounds. Is there a target word count? How much of the focus should be on the "non-API" entry points LEVELS, SETLENGTH, SET_GROWABLE_BIT, SET_TRUELENGTH, STRING_PTR, TRUELENGTH?

tdhock commented 2 weeks ago

I'm not sure you need to worry about a target word count. Please take as much space as you need to explain how and why these different non-API entry points are used. If possible, please explain which ones can't easily be changed without sacrificing some kind of efficiency. For comparison, my recent duckdb vs polars post is probably too long: https://rdatatable-community.github.io/The-Raft/posts/2024-10-17-duckdb_polars_reshape-toby_hocking/

aitap commented 2 weeks ago

Does this skeleton roughly match what you had in mind for the post? Would you like me to also cover the functions data.table has already got rid of (isFrame, [UN]SET_S4_OBJECT, SET_TYPEOF, NAMED)?

tdhock commented 2 weeks ago

Looks great. I like the historical introduction. I think it would be worth adding links to the R-devel thread that started the C API declarations earlier this year, which I believe is https://stat.ethz.ch/pipermail/r-devel/2024-April/083349.html. Yes, I think it would be worth discussing the functions data.table has already got rid of, with a brief explanation of how and why they were easy or quick to remove.