Open ZuoTiJia opened 1 year ago
We probably have to put some limits in place. For example, Postgres fails fast:
select rpad('a', 1073741824, 'a');
requested length too large
My Mac is spinning CPU like crazy with the same query in datafusion-cli, I had to kill the process
@ZuoTiJia is this kind of rpad needed for your real use case, or are you testing DF limits? :)
Postgres handles this in a very interesting manner: https://github.com/postgres/postgres/blob/d952373a987bad331c0e499463159dd142ced1ef/src/backend/utils/adt/oracle_compat.c#L282
I'm testing DF and I think it's dangerous for users if there is no limit.
Sounds good. I think we can implement a quick win like in PG: https://github.com/postgres/postgres/blob/c8e1ba736b2b9e8c98d37a5b77c4ed31baf94147/src/include/utils/memutils.h#L42
I'll take this later this week unless someone else volunteers.
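To make the "quick win" concrete, here is a minimal sketch of a fail-fast length check in the spirit of the Postgres one. It is not the DataFusion implementation: `rpad_checked` and `MAX_PAD_BYTES` are hypothetical names, and the cap simply borrows the value of Postgres's MaxAllocSize (0x3fffffff bytes). The worst-case size is estimated as 4 bytes per character, which is an assumption made for this sketch.

```rust
/// Hypothetical cap for this sketch, borrowing the value of
/// PostgreSQL's MaxAllocSize (0x3fffffff bytes, just under 1 GiB).
const MAX_PAD_BYTES: usize = 0x3fff_ffff;

/// Right-pads `s` to `target_len` characters using `fill`, truncating
/// `s` if it is already longer. Oversized requests are rejected before
/// any large allocation happens.
fn rpad_checked(s: &str, target_len: usize, fill: &str) -> Result<String, String> {
    // Worst case: every output character needs 4 bytes in UTF-8.
    // `checked_mul` also guards against overflowing `usize` itself.
    let worst_case_bytes = target_len
        .checked_mul(4)
        .ok_or("requested length too large")?;
    if worst_case_bytes > MAX_PAD_BYTES {
        return Err("requested length too large".to_string());
    }

    // Keep at most `target_len` characters of the input.
    let mut out: String = s.chars().take(target_len).collect();
    let have = out.chars().count();

    // Repeat the fill characters until the target length is reached.
    // An empty `fill` yields nothing, so the string is returned as is.
    out.extend(fill.chars().cycle().take(target_len - have));
    Ok(out)
}
```

With this check, `rpad_checked("a", 1073741824, "a")` returns an error immediately instead of spinning the CPU, while small requests such as `rpad_checked("hi", 5, "xy")` still pad normally.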
It looks like the check is implemented https://github.com/apache/datafusion/blob/main/datafusion/functions/src/unicode/rpad.rs#L131
However, when I tried to run the above SQL query, I got an offset overflow error from /arrow-array-52.2.0/src/array/byte_array.rs:210:45. Is that an OOM issue?
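On the OOM question: Arrow's Utf8 string arrays record value boundaries as 32-bit signed offsets, so an offset overflow means the running byte total no longer fits in an i32, which is a different condition from running out of memory. The sketch below shows only that arithmetic. `next_offset` is a hypothetical helper, not the arrow-array code at the path above, and it does not establish what actually triggered the error for this query.

```rust
/// Computes the next entry of a 32-bit offsets buffer after appending
/// a value of `value_len` bytes. Returns `None` when the new offset
/// cannot be represented as an `i32`, i.e. on offset overflow.
fn next_offset(current: i32, value_len: usize) -> Option<i32> {
    // The value length itself must fit in an i32.
    let len = i32::try_from(value_len).ok()?;
    // The running total must fit as well.
    current.checked_add(len)
}
```

Under these assumptions a single 1073741824-byte value still fits (`next_offset(0, 1073741824)` is `Some`), but appending a second one of the same size does not, since the total would be 2^31.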
Describe the bug
When I call the rpad function with a large length argument, it causes an OOM. Does DataFusion have a mechanism to limit the resources required for execution?
To Reproduce