-
### Describe the enhancement requested
Is there any standard or convention for passing column statistics through the C data interface?
For example, say there is a module that reads a Parquet fil…
-
Alleles are a challenge to represent efficiently in fixed-length arrays. There are a couple of problems:
1. the number of alleles is not known until the whole VCF file has been processed
2. there ca…
-
Dear Support Team,
We have again encountered an issue similar to the one outlined in [issue #806](https://github.com/manticoresoftware/manticoresearch/issues/806). Following a node crash, we are faced with the in…
-
### Problem description
I would like to be able to track dataframe specific metadata through processing, serialization, and deserialization.
A common use case for dataframe metadata is to store da…
-
### Modin version checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the latest released version of Modin.
- [X] I have confi…
-
### Current Behaviour
# converts the data types of the columns in the DataFrame to more appropriate types,
# useful for improving the performance of calculations.
…
-
Basically, implement this as an S3 plug-in:
https://blog.lancedb.com/chat-with-csv-excel-using-lancedb/
- create a new queue + DLQ
- create a new notification for `.csv` and `.xlsx` files from S3
…
-
Opening this issue to track thoughts on open source data schemas/standards/formats.
**General Questions**
- How are people in the world implementing open-source data standards/schemas and moving data…
-
Problem
=======
As datasets become larger and larger, storing training samples as individual files becomes impractical and inefficient. This can be addressed using sequential storage formats and s…
-
Subscribe to this issue and stay notified about new [daily trending repos in Rust](https://github.com/trending/rust?since=daily)!