onehouseinc / LakeView

Monitoring and insights on your data lakehouse tables
Apache License 2.0

[ENG-15363] Fixing issue of extractor stopping in case of zero batches in current page #127

Closed: karankm97 closed this 3 weeks ago

karankm97 commented 4 weeks ago

The extractor starts processing commits from the first incomplete commit, and it processes a commit group only if the lastModified of any of its files is later than the lastModified of the lastUploadedFile. If the extractor keeps starting from a very old incomplete commit, the number of new commits can grow beyond 1000. In that case, if no batches are found in the current page, the extractor completely ignores the subsequent pages. This worked with the old approach, but with the CONTINUE_ON_INCOMPLETE_COMMIT approach we need to process the subsequent pages as well. Hence, this change returns a non-null checkpoint so that processing continues under the new approach.
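For readers following along, the behaviour described above can be sketched roughly as below. This is a simplified illustration, not the code in this PR: the class `PagedExtractorSketch`, the `LEGACY` mode name, the use of a page index as the checkpoint, and the per-file (rather than per-commit-group) comparison are all assumptions made for the example. Only `CONTINUE_ON_INCOMPLETE_COMMIT`, `lastModified` and the null versus non-null checkpoint idea come from the description.

```java
final class PagedExtractorSketch {

    /** LEGACY is a hypothetical name for the "old approach" in the description. */
    enum Mode { LEGACY, CONTINUE_ON_INCOMPLETE_COMMIT }

    /** Number of files uploaded so far (stands in for real batch uploads). */
    private int uploaded = 0;

    /**
     * Processes one page of lastModified timestamps.
     * Returns the next page index as a checkpoint, or null to stop.
     */
    Integer processPage(long[] page, int pageIndex, long lastUploadedModified, Mode mode) {
        int batchSize = 0;
        for (long lastModified : page) {
            // Only files newer than the last uploaded file form a batch.
            if (lastModified > lastUploadedModified) {
                batchSize++;
            }
        }
        if (batchSize == 0) {
            // Zero batches in this page: the old approach stops here (null checkpoint),
            // while the new approach returns a non-null checkpoint to keep paging.
            return mode == Mode.CONTINUE_ON_INCOMPLETE_COMMIT ? Integer.valueOf(pageIndex + 1) : null;
        }
        uploaded += batchSize;
        return pageIndex + 1;
    }

    /** Walks the pages until a null checkpoint is returned or the pages run out. */
    static int run(long[][] pages, long lastUploadedModified, Mode mode) {
        PagedExtractorSketch extractor = new PagedExtractorSketch();
        Integer checkpoint = 0;
        while (checkpoint != null && checkpoint < pages.length) {
            checkpoint = extractor.processPage(
                pages[checkpoint], checkpoint, lastUploadedModified, mode);
        }
        return extractor.uploaded;
    }
}
```

In this sketch, if the first page holds only old files and the second page holds two newer ones, `run` uploads nothing in `LEGACY` mode and uploads both files in `CONTINUE_ON_INCOMPLETE_COMMIT` mode.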

nimahajan commented 4 weeks ago

Task linked: ENG-15363 [ZENDESK #910] UI issue with table prod_analytics_canonical

sonarcloud[bot] commented 3 weeks ago

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
90.0% Coverage on New Code
0.0% Duplication on New Code
