nitzmali closed this issue 3 years ago
Hey @nitzmali, thanks for spotting this. I had a small mistake in the code that should be fixed in #13. I also published the fix in version 0.2.3, if you want to test it out.
Thanks a lot @jeppe742 for the quick response. It works perfectly now. Cheers.
from deltalake import DeltaTable
df = DeltaTable("path_to_delta_table", file_system=fs).to_pandas()
With the latest release, 0.2.2, I have been trying to read a Delta table from S3 in which only a few rows have been updated. When I read the full Delta table, the dataframe contains both the initial values and the updated values. But I only need the latest snapshot, i.e. the state after the latest update, not every update ever made. Am I missing something? For validation, I read the same table through a Spark context, and it returns only the latest snapshot. Any help?
For reference, I have attached a snapshot of the read from DeltaTable and the read from Spark; the dataframes have two rows and one row, respectively.
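For anyone hitting the same symptom, here is some background on why the two reads can differ. In the Delta Lake transaction log, an update typically writes a new data file and records a `remove` action for the file it replaces. A reader that loads every Parquet file under the table path, instead of only the files still active in the log, would therefore see both the old and the new rows. Below is a minimal sketch of that log replay. It is not this library's actual implementation, and the `active_files` helper and the file names are hypothetical, chosen only for illustration.

```python
import json


def active_files(commit_lines):
    """Replay Delta-style log actions in order and return the live data files.

    Each element of commit_lines is one JSON action line, as found in the
    _delta_log/*.json commit files. Only "add" and "remove" are handled here.
    """
    live = set()
    for line in commit_lines:
        action = json.loads(line)
        if "add" in action:
            # A newly written data file becomes part of the snapshot.
            live.add(action["add"]["path"])
        elif "remove" in action:
            # A replaced (or deleted) file drops out of the snapshot.
            live.discard(action["remove"]["path"])
    return live


# Hypothetical log: an initial write, then an update that rewrites the file.
log = [
    '{"add": {"path": "part-0000.parquet"}}',
    '{"remove": {"path": "part-0000.parquet"}}',
    '{"add": {"path": "part-0001.parquet"}}',
]

print(sorted(active_files(log)))  # ['part-0001.parquet']
```

Reading only the files returned by such a replay should match what Spark reports as the latest snapshot, while reading both Parquet files would reproduce the duplicated rows described above.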