Open laurencewells opened 3 years ago
Thanks for pointing out the error in the documentation. I'll update the pyspark doc to fix this.
Hi,
We ran into the same surprise yesterday and did not understand why the job was starting at the end of the stream every time. It cost us time debugging and building workarounds. Could this be prioritized? It is really confusing for pyspark users.
Thanks! Best Regards, Stefan Prisca.
@stefanprisca Same here, we sank a good hour or two into debugging what was happening.
Hi, describing the issue here for anyone else running into the same problem.
For the Pyspark documentation here: azure-event-hubs-spark/docs/PySpark/structured-streaming-pyspark.md
it states that the default starting position is the start of the stream.
The behaviour we were seeing is that this is not the case: the starting position has to be set explicitly to stream from the start. If nothing is set, the default behaviour is to start from the end of the stream. See the snippet below for how we worked around it.
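A minimal sketch of the workaround, assuming the `eventhubs.startingPosition` option and the JSON event-position format described elsewhere in the same PySpark doc (where an offset of `"-1"` means start of stream); the connection string is a placeholder, and depending on your connector version you may also need to encrypt it via `EventHubsUtils.encrypt` before passing it in:

```python
import json
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("eventhubs-read-from-start").getOrCreate()

# Placeholder connection string for the target Event Hub.
connection_string = "Endpoint=sb://<namespace>.servicebus.windows.net/;SharedAccessKeyName=<key-name>;SharedAccessKey=<key>;EntityPath=<event-hub>"

# Event position pointing at the start of the stream:
# offset "-1" is the connector's convention for "beginning of stream".
starting_position = {
    "offset": "-1",
    "seqNo": -1,
    "enqueuedTime": None,
    "isInclusive": True,
}

eh_conf = {
    "eventhubs.connectionString": connection_string,
    # Without this option the stream silently defaults to the END of the stream.
    "eventhubs.startingPosition": json.dumps(starting_position),
}

df = (
    spark.readStream
    .format("eventhubs")
    .options(**eh_conf)
    .load()
)
```

With `eventhubs.startingPosition` omitted, the same `readStream` only picks up events enqueued after the query starts, which is the behaviour described above.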