-
See https://databricks.com/blog/2015/02/17/introducing-dataframes-in-spark-for-large-scale-data-science.html
-
Hi, I have seen #118 and also read the documentation, which says:
```
Note: Due to limitations in Spark, metadata modification is unsupported in
the Python, SQL, and R language APIs.
```
While I u…
-
With #3284, we can now read ORC files into dask dataframes. It would be good/interesting to benchmark this implementation and see if there are any easy gains we're missing (this was never done). This …
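For the benchmarking part, a minimal timing harness might look like the sketch below. It only uses the standard library and times an arbitrary callable; the `benchmark` helper, its `repeat` parameter, and the stand-in workload are all hypothetical names made up for illustration, not anything from #3284.

```python
import statistics
import time


def benchmark(fn, repeat=5):
    """Call fn() `repeat` times and summarize wall-clock timings in seconds."""
    timings = []
    for _ in range(repeat):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return {
        "min": min(timings),
        "median": statistics.median(timings),
        "runs": repeat,
    }


# Stand-in workload so the sketch runs without any ORC data on disk.
result = benchmark(lambda: sum(range(100_000)), repeat=3)
print(result)
```

With dask and an ORC engine installed, the stand-in could be swapped for something like `lambda: dd.read_orc("data/*.orc").compute()`, where the path is a placeholder.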
-
Hi,
I am using this "s3-sqs" connector with Spark Structured Streaming and Delta Lake to process incoming data in partitioned S3 buckets.
The problem I am facing with the "s3-sqs" source is that the…
-
Reading/writing dataframes to/from Avro files with timestamps in Spark gives inconsistent behavior. The point is that the conversion between timestamps and integers uses different logic in sp…
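To illustrate how a timestamp/integer conversion can disagree between writer and reader, here is a minimal sketch in plain Python. It assumes the mismatch is one of units (Avro defines both `timestamp-millis` and `timestamp-micros` logical types); that is only one possible cause and is not confirmed by this report. The helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_millis(ts):
    # Encode a timestamp as whole milliseconds since the Unix epoch.
    return (ts - EPOCH) // timedelta(milliseconds=1)


def from_millis(n):
    # Decode an integer as milliseconds since the Unix epoch.
    return EPOCH + timedelta(milliseconds=n)


def from_micros(n):
    # Decode an integer as microseconds since the Unix epoch.
    return EPOCH + timedelta(microseconds=n)


ts = datetime(2020, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
stored = to_millis(ts)            # writer stores milliseconds

same_unit = from_millis(stored)   # reader agrees on the unit
wrong_unit = from_micros(stored)  # reader assumes microseconds (assumed mismatch)

print(same_unit.isoformat())      # 2020-01-01T12:00:00+00:00
print(wrong_unit.isoformat())     # 1970-01-19T06:18:00+00:00
```

The same stored integer round-trips correctly only when both sides agree on the unit; otherwise the value lands in January 1970.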
-
**_Tips before filing an issue_**
- Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)?
- Join the mailing list to engage in conversations and get faster support at dev-subscribe@h…
-
[Ibis](https://ibis-project.org) is a portable Python dataframe library, initially created by the creator of pandas. Today, it's a well-maintained project that supports 20+ backends including pan…
-
Hi,
I was just wondering if you have plans to make this code more generic for other use cases.
For example, you have created a case class for this particular use case, but if I have to perform CDC on 20 …
-
Currently trying to join two dataframes with the following command:
`val df_green_pickup = green_data.join(neighborhoods).where($"pickup_point" within $"polygon")
display(df_green_pickup)`
Havin…