-
**Describe the bug**
1. The code blocks in pipeline_model.fit(): there is no progress, and the Spark stage always stays at 0.
2. CSV data: 200 columns, 800,000 rows.
3. Training on a single CentOS machine takes 1 min.
**To Reproduce**…
-
**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
Part of #2079
Following on from #2292 and #2291, it should be possible to pull the multi…
-
Hi! Using LightGBM, I ran into another problem. I'm not sure if it is a bug or a feature :) but our data has a lot of empty values, so previously we used sparse vectors to store features, and it worked fine…
-
**Environment Description**
* Hudi version : 0.12.2
* Spark version : 3.2.2
* Hadoop version : 2.7.3
* Storage : HDFS
**Describe the problem you faced**
I have a hudi table and I delet…
-
## Problem Description
When utilizing Velox to read data from Spark, we've observed that certain data types are not represented identically between Spark and Parquet files. This discrepancy results…
-
**Describe the bug**
I am trying to run ResponsibleAIDashboard in an Azure Databricks notebook. I have a cluster, and I have installed the raiwidgets library on the cluster.
When I do this - ResponsibleAI…
-
Substrait plan validation can be used to determine whether to offload the computation to Velox. If the validation fails, the computation falls back to Spark for execution.
Instead of just trying the…
-
Hi bartag and contrib,
First off, thank you so much for making py4j! I use pyspark as part of my job and it's a lifesaver in terms of code reuse with the rest of our products and with building on t…
-
## Problem Description
This design proposal is for adding feature request #229.
Currently, Hyperspace supports creating indexes only on data with fixed schema. This means:
- All columns from "…
-