Open asfimport opened 2 years ago
Vibhatha Lakmal Abeykoon / @vibhatha: In this context, we can also analyze the data conversions that may happen inside UDFs for data structures not supported by Arrow. Most data science and data engineering applications in the Python space use Pandas- or NumPy-based data structures, so this should not be a serious problem, but it is worth keeping an eye on cases that fall outside these.
Vibhatha Lakmal Abeykoon / @vibhatha: Also worth noting are the performance limitations of UDFs that are executed once per row.
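To make the per-row concern concrete, here is a minimal plain-Python sketch. It does not use Arrow's UDF API; `apply_per_row`, `apply_per_batch`, and the two `add_one_*` functions are hypothetical names introduced only for illustration. It shows that a per-row UDF pays one Python-level call per element, while a batch UDF is called once per column. A real vectorized UDF would typically hand the whole batch to native code, so the gap in practice is likely larger than what this sketch measures.

```python
import time


def add_one_scalar(x):
    # Per-row UDF: invoked once for every element of the column.
    return x + 1


def add_one_batch(xs):
    # Batch UDF: invoked once and handed the whole column.
    return [x + 1 for x in xs]


def apply_per_row(udf, column):
    # Hypothetical driver that calls the UDF once per row and counts the calls.
    out, calls = [], 0
    for value in column:
        out.append(udf(value))
        calls += 1
    return out, calls


def apply_per_batch(udf, column):
    # Hypothetical driver that calls the UDF exactly once for the column.
    return udf(column), 1


column = list(range(100_000))

start = time.perf_counter()
row_result, row_calls = apply_per_row(add_one_scalar, column)
row_seconds = time.perf_counter() - start

start = time.perf_counter()
batch_result, batch_calls = apply_per_batch(add_one_batch, column)
batch_seconds = time.perf_counter() - start

print(f"per-row:   {row_calls} UDF calls, {row_seconds:.4f}s")
print(f"per-batch: {batch_calls} UDF call,  {batch_seconds:.4f}s")
```

Both paths produce the same values; only the number of UDF invocations differs, and that is the overhead a monitoring interface would want to surface.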
Todd Farmer / @toddfarmer: This issue was last updated over 90 days ago, which may be an indication it is no longer being actively worked. To better reflect the current state, the issue is being unassigned. Please feel free to re-take assignment of the issue if it is being actively worked, or if you plan to start that work soon.
Apache Arrow JIRA Bot: This issue was last updated over 90 days ago, which may be an indication it is no longer being actively worked. To better reflect the current state, the issue is being unassigned per project policy. Please feel free to re-take assignment of the issue if it is being actively worked, or if you plan to start that work soon.
We need an interface to evaluate the memory footprint, execution time, and health of UDFs and return a meaningful status, e.g.:
Status::HighMemoryUsageException()
Status::TimeLimitException()
Note: This is also aligned with resource monitoring in the parallel execution space.
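As a rough illustration of what such an interface could look like on the Python side, the sketch below wraps a UDF call, measures wall-clock time and peak Python heap usage, and maps the outcome to a status. Everything here is an assumption made for illustration: `UdfStatus`, `UdfReport`, `run_monitored`, and the default thresholds are hypothetical names and values, not part of Arrow. Note that `tracemalloc` only sees allocations made through Python's allocator, so memory held in Arrow's native buffers would not be counted, and the limits are checked after the UDF returns rather than enforced while it runs.

```python
import time
import tracemalloc
from dataclasses import dataclass
from enum import Enum


class UdfStatus(Enum):
    # Hypothetical status codes, loosely mirroring the names proposed above.
    OK = "OK"
    HIGH_MEMORY_USAGE = "HighMemoryUsage"
    TIME_LIMIT = "TimeLimit"
    FAILED = "Failed"


@dataclass
class UdfReport:
    status: UdfStatus
    result: object
    elapsed_s: float
    peak_bytes: int
    error: str = ""


def run_monitored(udf, *args, max_seconds=1.0, max_peak_bytes=64 * 1024 * 1024):
    """Run udf(*args), then classify the call by time and peak Python memory."""
    # Assumes tracemalloc is not already in use elsewhere in the process,
    # because stop() below ends tracing for everyone.
    tracemalloc.start()
    start = time.perf_counter()
    result, error = None, ""
    try:
        result = udf(*args)
    except Exception as exc:  # a failing UDF is reported, not re-raised
        error = repr(exc)
    finally:
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()

    if error:
        status = UdfStatus.FAILED
    elif peak > max_peak_bytes:
        status = UdfStatus.HIGH_MEMORY_USAGE
    elif elapsed > max_seconds:
        status = UdfStatus.TIME_LIMIT
    else:
        status = UdfStatus.OK
    return UdfReport(status, result, elapsed, peak, error)


report = run_monitored(lambda xs: [x * 2 for x in xs], [1, 2, 3])
print(report.status, report.result)
```

A post-hoc check like this can only report that a limit was exceeded. Stopping a runaway UDF mid-execution would need a different mechanism, such as running it in a separate process, which is outside the scope of this sketch.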
Reporter: Vibhatha Lakmal Abeykoon / @vibhatha
Note: This issue was originally created as ARROW-15637. Please see the migration documentation for further details.