lancedb / lance

Modern columnar data format for ML and LLMs implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, PyArrow, and PyTorch, with more integrations coming.
https://lancedb.github.io/lance/
Apache License 2.0
3.96k stars, 219 forks

Automatically convert image to tensors in TF data pipeline #1314

Closed wjones127 closed 1 year ago

wjones127 commented 1 year ago

In https://github.com/lancedb/lance/blob/main/python/python/lance/tf/data.py, users should be able to specify that they want an image column to be read as a tensor.

wjones127 commented 1 year ago

Alternatively, we should at least make sure it is still a one-liner to convert the image column into a TensorFlow tensor.
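
For illustration, here is a minimal sketch of what such a one-liner could look like using only stock TensorFlow calls. It does not use the `lance.tf.data` API; the in-memory dataset, the `image` and `label` column names, and the PNG encoding are assumptions standing in for rows with an encoded-image column.

```python
import tensorflow as tf

# Build a tiny 2x3 RGB image and encode it as PNG bytes, standing in for
# one value of an encoded-image column (assumed layout, not Lance-specific).
pixels = tf.zeros([2, 3, 3], dtype=tf.uint8)
png_bytes = tf.io.encode_png(pixels)

# Hypothetical stand-in for a pipeline that yields rows as dicts of columns.
ds = tf.data.Dataset.from_tensors({"image": png_bytes, "label": tf.constant(1)})

# The one-liner: decode the encoded bytes into a uint8 tensor inside the pipeline.
ds = ds.map(lambda row: {**row, "image": tf.io.decode_png(row["image"], channels=3)})

# Pull the first row to inspect the decoded image tensor.
decoded = next(iter(ds))["image"]
```

The `map` call keeps every other column untouched and replaces only the image bytes, so the decode step stays a single line regardless of how many columns the rows carry.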

rok commented 1 year ago

This is now covered for encoded images and fixed-shape tensors. Variable-shape tensors are tracked in their own issue: https://github.com/lancedb/lance/issues/1387.