Kotlin / dataframe

Structured data processing in Kotlin
https://kotlin.github.io/dataframe/overview.html
Apache License 2.0

Prepare a survey (or GitHub Discussion) about data sources #408

Open zaleslaw opened 1 year ago

zaleslaw commented 1 year ago

The draft list of data sources:

  1. SQL Databases based on JDBC
  2. XML
  3. Protobuf
  4. Parquet
  5. ORC
  6. SparkSQL
  7. different files on the FileSystem
  8. NoSQL databases (MongoDB, Cassandra, Ignite)
  9. Queues (Kafka)
  10. Amazon (S3)
  11. Arrow IPC (Feather v2)
  12. Apache Avro

Jolanrensen commented 1 year ago

Probably move this to a discussion so people can upvote and leave others :)

zaleslaw commented 1 year ago

@Jolanrensen sorry, I'd prefer a Google Form, where we can add several different questions. It's better for analysis.

Jolanrensen commented 1 year ago

Sure, but it might also be nice for the community to see which types of data sources other people are interested in.

zaleslaw commented 1 year ago

It would be nice to prepare notebooks with the results :)

zaleslaw commented 1 year ago

@Jolanrensen will you share something?

Jolanrensen commented 1 year ago

Maybe we should add Exposed to the list as a data source. It was suggested here first and seems to cover several DB types

Jolanrensen commented 1 year ago

Also, for people wanting to do heavy operations with lots of large columns, we might want to provide interop with Multik as well

koperagen commented 1 year ago

Maybe something like Google Sheets

Jolanrensen commented 1 year ago

> Maybe something like Google Sheets

Like integration with their API? Could be easy, since we already have Excel support.

koperagen commented 1 year ago

> > Maybe something like Google Sheets
>
> Like integration with their API? Could be easy, since we already have Excel support.

Yes, I think it might be a good step for building data processing pipelines. For example, read some data, transform it with dataframe, and write it to a Google Sheet. Or have a Google Sheet edited by a human and run dataframe processing on it when needed. Since we have Excel support, if this integration proves to bring too little value, we can also consider having only a tutorial. I mostly want to add it not because it's impossible to do now, but to bring attention to possible applications of our library.

Jolanrensen commented 1 year ago

XML would also probably need OpenAPI support, similar to JSON

belovrv commented 1 year ago

I would also add YAML to the list.