dotnet / datalab

This repo is for experimentation and exploring new ideas involving ADO.NET, EF Core, and other areas related to .NET data.
MIT License

guidance on an Apache Arrow Layer #17

Closed caljnj closed 3 years ago

caljnj commented 3 years ago

Hi

I'm working off a comment by yzorg regarding integration of Apache Arrow into this project, and the answer was that this would be implemented at a higher layer than driver level.

I was after some pointers, really. My aim here is to intercept the ODBC driver's storage mechanism and store the data in "Feather" format instead (a SIMD-friendly, memory-optimized file format).

can you recommend:

Thanks!

roji commented 3 years ago

@caljnj modern approaches to database drivers in .NET typically don't include ODBC - high-performance drivers and experiments are currently fully managed, 100% .NET implementations.

Some more context on what you're planning to achieve would be useful; are you looking to create some sort of database driver over Apache Arrow, or something that bridges/integrates a relational database (e.g. SQL Server) with Apache Arrow? If it's the former, a .NET Apache Arrow library seems to already exist.

caljnj commented 3 years ago

Sorry! I misunderstood what you were all up to here. What I was after, I guess, was to build the layer above the driver, exactly like turbodbc does for Python, but with bindings for the R language: https://arrow.apache.org/blog/2017/06/16/turbodbc-arrow/
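
To make the "layer above the driver" idea concrete, here is a minimal sketch in Python of the general shape such a layer takes: pull rows from a driver cursor in batches and transpose each batch into columns. This is only an illustration under stated assumptions, not turbodbc's actual implementation. The standard-library `sqlite3` module stands in for an ODBC driver, plain Python lists stand in for Arrow arrays, and `fetch_column_batches` is a hypothetical helper name.

```python
import sqlite3
from typing import Any, Dict, Iterator, List


def fetch_column_batches(cursor, batch_size: int = 2) -> Iterator[Dict[str, List[Any]]]:
    """Yield columnar batches (column name -> list of values) from a DB-API cursor.

    Hypothetical helper: a real layer would build Arrow arrays here
    instead of plain Python lists.
    """
    names = [d[0] for d in cursor.description]
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        # Transpose the row tuples into one list per column.
        columns = [list(col) for col in zip(*rows)]
        yield dict(zip(names, columns))


# sqlite3 is only a stand-in for the real driver (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")])

cur = conn.execute("SELECT id, name FROM t ORDER BY id")
batches = list(fetch_column_batches(cur, batch_size=2))
for batch in batches:
    print(batch)
```

Each yielded dictionary plays the role of one columnar record batch; in a real bridge, each batch would be handed to an Arrow library and written out, for example as a Feather file.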