Open maytasm opened 6 months ago
Seems like we used to have something like https://github.com/apache/iceberg-python/commit/4f0a5c6203888ff105c1f09f41c17245f477d2ab but it's gone? @Fokko @TGooch44
Hey @maytasm Thanks for raising this. We don't have the ResidualEvaluator today, but it would be great to add it. We can take inspiration from Java. The code that you're referring to is gone because we have built up the expression system from the ground up.
The evaluators should already be part of the codebase. Are you interested in contributing to this?
@Fokko Thanks for getting back to me. I can look into contributing. I am not too familiar with the new pyiceberg rewrite (the current state of this library), but I was wondering if it would be something like porting over https://github.com/apache/iceberg-python/commit/4f0a5c6203888ff105c1f09f41c17245f477d2ab#diff-bd871c0e4a5ce5cb7edcb871e4a2b8084e44a432073c25db8b72e3ad4b94e16f ? Or do you see any blockers or differences with the old Python residual evaluator and/or with adding this to the FileScanTask?
@maytasm The old evaluator might be a good starting point, as it is almost a 1-to-1 copy of the Java implementation. I would double-check whether there have been additions to the Java ResidualEvaluator in the meantime.
This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.
@Fokko I am picking this up in #1223 #1388
@tusharchou Thank you, I've removed the stale label 🙌
Question
A table scan returns a DataScan. I can call plan_files on the DataScan to get a list of FileScanTask objects. How do I find out whether any of the files has a residual left over from the filtering? Thanks!
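To make the term concrete: a residual is the part of the row filter that still has to be checked row by row once a file's partition values are known. Below is a minimal sketch of that idea in plain Python. It is not the pyiceberg API and not the Java ResidualEvaluator. The expression classes are local stand-ins that only borrow familiar names, `residual_for` is a hypothetical helper, and the sketch assumes identity partitioning (the partition value equals the column value for every row in the file).

```python
from dataclasses import dataclass
from typing import Any, Dict, Union


# Local stand-ins for illustration only; these are NOT the pyiceberg classes.
@dataclass(frozen=True)
class AlwaysTrue:
    pass


@dataclass(frozen=True)
class AlwaysFalse:
    pass


@dataclass(frozen=True)
class EqualTo:
    column: str
    value: Any


@dataclass(frozen=True)
class GreaterThan:
    column: str
    value: Any


@dataclass(frozen=True)
class And:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Or:
    left: "Expr"
    right: "Expr"


Expr = Union[AlwaysTrue, AlwaysFalse, EqualTo, GreaterThan, And, Or]


def residual_for(expr: Expr, partition: Dict[str, Any]) -> Expr:
    """Hypothetical helper: simplify `expr` for a file whose identity
    partition values are given in `partition`."""
    if isinstance(expr, And):
        left = residual_for(expr.left, partition)
        right = residual_for(expr.right, partition)
        if isinstance(left, AlwaysFalse) or isinstance(right, AlwaysFalse):
            return AlwaysFalse()  # no row in this file can match
        if isinstance(left, AlwaysTrue):
            return right
        if isinstance(right, AlwaysTrue):
            return left
        return And(left, right)
    if isinstance(expr, Or):
        left = residual_for(expr.left, partition)
        right = residual_for(expr.right, partition)
        if isinstance(left, AlwaysTrue) or isinstance(right, AlwaysTrue):
            return AlwaysTrue()  # every row in this file matches
        if isinstance(left, AlwaysFalse):
            return right
        if isinstance(right, AlwaysFalse):
            return left
        return Or(left, right)
    if isinstance(expr, (EqualTo, GreaterThan)):
        if expr.column not in partition:
            return expr  # not a partition column: must be checked per row
        actual = partition[expr.column]
        if isinstance(expr, EqualTo):
            matches = actual == expr.value
        else:
            matches = actual > expr.value
        return AlwaysTrue() if matches else AlwaysFalse()
    return expr  # AlwaysTrue / AlwaysFalse pass through unchanged
```

With a filter such as `And(EqualTo("day", "2024-01-01"), GreaterThan("amount", 100))`, a file in partition `day = "2024-01-01"` leaves only the `amount` predicate as its residual, while a file in any other `day` partition reduces to `AlwaysFalse()`. A residual of `AlwaysTrue()` would mean the file needs no further row-level filtering.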