NVIDIA / spark-rapids

Spark RAPIDS plugin - accelerate Apache Spark with GPUs
https://nvidia.github.io/spark-rapids
Apache License 2.0

Support invalid partToExtract for parse_url #11661

Closed thirtiseven closed 3 weeks ago

thirtiseven commented 4 weeks ago

Closes #11659

In parse_url, we used to fall back to the CPU if partToExtract was not valid. However, the CPU behaviour in that case is simply to return null, which we can easily do on the GPU as well.

In customer case #11659, however, the query did run on the GPU because GpuOverrides treated lowercase path as valid, which it is not, so execution reached a defensive branch that threw an exception.

This PR supports invalid partToExtract values in parse_url when running on the GPU, and fixes the bug in its override.
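
To illustrate the intended semantics, here is a minimal Python sketch of "return null for an invalid partToExtract". It is not the plugin's code: `parse_url_sketch` and `VALID_PARTS` are hypothetical names, `urllib.parse` stands in for the real URL parser, and edge cases (malformed URLs, the optional key argument for QUERY) are not modelled. It assumes the part names are the uppercase ones documented for Spark's parse_url and that matching is case-sensitive, which is why lowercase path counts as invalid.

```python
from urllib.parse import urlsplit

# Part names documented for Spark's parse_url; matching is assumed case-sensitive.
VALID_PARTS = {
    "HOST", "PATH", "QUERY", "REF",
    "PROTOCOL", "FILE", "AUTHORITY", "USERINFO",
}


def parse_url_sketch(url, part_to_extract):
    """Hypothetical stand-in for parse_url(url, partToExtract)."""
    # An unrecognised part (including lowercase "path") yields null
    # instead of raising, mirroring the CPU behaviour described above.
    if part_to_extract not in VALID_PARTS:
        return None

    parts = urlsplit(url)
    # FILE is the path plus the query string, when a query is present.
    file_part = parts.path + ("?" + parts.query if parts.query else "")
    # USERINFO is whatever precedes the last "@" in the authority, if anything.
    userinfo = parts.netloc.rpartition("@")[0]

    extracted = {
        "HOST": parts.hostname,
        "PATH": parts.path,
        "QUERY": parts.query,
        "REF": parts.fragment,
        "PROTOCOL": parts.scheme,
        "FILE": file_part,
        "AUTHORITY": parts.netloc,
        "USERINFO": userinfo,
    }
    # This sketch maps missing or empty components to None.
    return extracted[part_to_extract] or None
```

With this sketch, `parse_url_sketch("https://example.com/a/b?x=1", "PATH")` extracts the path, while the same call with `"path"` returns `None` rather than throwing.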

thirtiseven commented 4 weeks ago

build