Open jlowe opened 1 month ago
Here's a naive approach to identifying candidates for unshimming: list the files under sql-plugin/src/main/spark320/ that have no peer at the same relative path in any other shim directory after #11159:
$ cd sql-plugin/src/main/spark320
$ for i in $(find . -type f); do count=$(ls ../*/$i | wc -l); if [[ $count == 1 ]]; then echo $i; fi; done
./scala/com/nvidia/spark/rapids/v1FallbackWriters.scala
./scala/com/nvidia/spark/rapids/shims/GpuOrcDataReaderBase.scala
./scala/com/nvidia/spark/rapids/shims/Spark320PlusShims.scala
./scala/com/nvidia/spark/rapids/shims/HashUtils.scala
./scala/com/nvidia/spark/rapids/shims/YearParseUtil.scala
./scala/com/nvidia/spark/rapids/shims/CudfUnsafeRowBase.scala
./scala/com/nvidia/spark/rapids/shims/ShimBaseSubqueryExec.scala
./scala/com/nvidia/spark/rapids/shims/OffsetWindowFunctionMeta.scala
./scala/com/nvidia/spark/rapids/shims/ShimAQEShuffleReadExec.scala
./scala/com/nvidia/spark/rapids/shims/extractValueShims.scala
./scala/com/nvidia/spark/rapids/shims/TreeNode.scala
./scala/com/nvidia/spark/rapids/shims/ShimPredicateHelper.scala
./scala/com/nvidia/spark/rapids/shims/gpuWindows.scala
./scala/com/nvidia/spark/rapids/shims/TypeSigUtil.scala
./scala/com/nvidia/spark/rapids/shims/GpuOrcDataReader320Plus.scala
./scala/com/nvidia/spark/rapids/shims/RapidsCsvScanMeta.scala
./scala/com/nvidia/spark/rapids/shims/Spark31Xuntil33XShims.scala
./scala/com/nvidia/spark/rapids/shims/AnsiCastRuleShims.scala
./scala/com/nvidia/spark/rapids/shims/RebaseShims.scala
./scala/com/nvidia/spark/rapids/shims/GpuBatchScanExecBase.scala
./scala/com/nvidia/spark/rapids/shims/OrcShims320untilAllBase.scala
./scala/com/nvidia/spark/rapids/shims/Spark320PlusNonDBShims.scala
./scala/com/nvidia/spark/rapids/shims/spark320/SparkShimServiceProvider.scala
./scala/com/nvidia/spark/rapids/spark320/RapidsShuffleManager.scala
./scala/org/apache/spark/rapids/shims/GpuShuffleBlockResolver.scala
./scala/org/apache/spark/rapids/shims/ShuffledBatchRDDUtil.scala
./scala/org/apache/spark/rapids/shims/storage/ShimDiskBlockManager.scala
./scala/org/apache/spark/sql/rapids/shims/misc.scala
./scala/org/apache/spark/sql/rapids/shims/Spark32XShimsUtils.scala
./scala/org/apache/spark/sql/rapids/shims/AvroUtils.scala
./scala/org/apache/spark/sql/rapids/shims/RapidsQueryErrorUtils.scala
./scala/org/apache/spark/sql/rapids/shims/RapidsShuffleThreadedWriter.scala
./scala/org/apache/spark/sql/rapids/shims/datetimeExpressions.scala
./scala/org/apache/spark/sql/rapids/GpuDataSource.scala
./scala/org/apache/spark/sql/execution/ShimTrampolineUtil.scala
./scala/org/apache/spark/sql/hive/rapids/shims/GpuInsertIntoHiveTable.scala
./scala/org/apache/spark/sql/hive/rapids/shims/GpuCreateHiveTableAsSelectCommand.scala
./scala/org/apache/spark/storage/RapidsShuffleBlockFetcherIterator.scala
./scala/org/apache/spark/storage/RapidsPushBasedFetchHelper.scala
./java/com/nvidia/spark/rapids/shims/ShimSupportsRuntimeFiltering.java
./java/com/nvidia/spark/rapids/shims/XxHash64Shims.scala
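The one-liner above can also be written as a small reusable function. This is only a sketch of the same peer-counting idea: the function name `find_unshim_candidates` and its two arguments are made up for illustration and are not part of the repository. It assumes bash, and unlike the one-liner it reads paths line by line, so file names containing spaces are handled.

```shell
#!/usr/bin/env bash
# Sketch (hypothetical helper, not part of the repo): print every file under
# <main_dir>/<shim> that has no same-path peer in any sibling directory.
# Usage: find_unshim_candidates <main_dir> <shim_name>
find_unshim_candidates() {
  local main_dir=$1 shim=$2
  (
    # Work relative to the shim directory so paths look like ./scala/...
    cd "$main_dir/$shim" || exit 1
    find . -type f -print | sort | while IFS= read -r rel; do
      peers=0
      # Count how many sibling directories (including this one) hold
      # a regular file at the same relative path.
      for candidate in ../*/"$rel"; do
        if [[ -f $candidate ]]; then
          peers=$((peers + 1))
        fi
      done
      # Exactly one match means only this shim directory has the file.
      if [[ $peers -eq 1 ]]; then
        printf '%s\n' "$rel"
      fi
    done
  )
}
```

For the layout discussed here it would be invoked as `find_unshim_candidates sql-plugin/src/main spark320`. Like the original command, it only compares paths, so a file whose peers differ in name but not in content would still be reported.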
Is your feature request related to a problem? Please describe.
After #11159, a number of classes live under a Spark-specific shim directory but are now common across all supported Spark versions.

Describe the solution you'd like
Shimmed classes that are now common should be moved to the standard paths and their shim directives removed to simplify the code base. Identify the common classes. Base classes or traits that existed only for shim reasons and no longer need to be shims should be removed (e.g. ShimUnaryExpression and many other classes in TreeNode.scala).