Open ShenJiahuan opened 1 year ago
For this one, @felixwang9817 can help :)
Expected Behavior
Feast should let Spark Structured Streaming process messages starting from the earliest available Kafka offset.
Current Behavior
Feast only lets Spark Structured Streaming process messages that are produced after the ingestion procedure starts.
Steps to reproduce
Start the Kafka producer first, and then invoke SparkKafkaProcessor.ingest_stream_feature_view. The first few messages will not be observed in the output.
Specifications
Possible Solution
Set startingOffsets to earliest.
https://github.com/feast-dev/feast/blob/870762ae9b78d00f4ea144a9ad6174b2b2516176/sdk/python/feast/infra/contrib/spark_kafka_processor.py#L86
https://github.com/feast-dev/feast/blob/870762ae9b78d00f4ea144a9ad6174b2b2516176/sdk/python/feast/infra/contrib/spark_kafka_processor.py#L109
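For reference, here is a minimal sketch of what the proposed change could look like on the Spark side. It is not the actual Feast code: the helper names `kafka_read_options` and `read_kafka_stream`, the broker address, and the topic name are all hypothetical. It relies only on the standard Spark Kafka source option `startingOffsets`, which defaults to `latest` for streaming queries (this matches the behavior described above).

```python
from typing import Dict


def kafka_read_options(
    bootstrap_servers: str,
    topic: str,
    starting_offsets: str = "earliest",
) -> Dict[str, str]:
    """Build options for Spark's Kafka source (hypothetical helper, not part of Feast).

    starting_offsets may be "earliest", "latest", or a JSON string of
    per-partition offsets, as accepted by the Spark Kafka source.
    """
    is_json = starting_offsets.strip().startswith("{")
    if starting_offsets not in ("earliest", "latest") and not is_json:
        raise ValueError(f"unsupported startingOffsets: {starting_offsets!r}")
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        # Streaming queries default to "latest"; "earliest" also reads
        # messages produced before the query started.
        "startingOffsets": starting_offsets,
    }


def read_kafka_stream(spark, options: Dict[str, str]):
    """Create a streaming DataFrame from Kafka using an existing SparkSession."""
    return spark.readStream.format("kafka").options(**options).load()


# Usage (assumes a running broker and a SparkSession named `spark`):
# df = read_kafka_stream(spark, kafka_read_options("localhost:9092", "my_topic"))
```

Note that Spark applies `startingOffsets` only when a query starts for the first time. A query restarted from an existing checkpoint resumes from its stored offsets, so making the value configurable rather than hard-coding `earliest` may be preferable.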