spark-columns Search Results

1000+ results for spark-columns

microsoft/SynapseML #936

code block LightGBMClassifier fit on yarn

**Describe the bug** 1. code block in pipeline_model.fit(), No progress, spark stage always 0 2. csv data : column 200, row 800000 3. one centos compute train cost time: 1min **To Reproduce**…

rusonding updated 3 years ago
1
apache/datafusion #2293

Single File Per ParquetExec, AvroExec, etc...

**Is your feature request related to a problem or challenge? Please describe what you are trying to do.** Part of #2079 Following on from #2292 and #2291 it should be possible to pull the multi…

tustvold updated 2 years ago
4
microsoft/SynapseML #304

Featurizer should provide option to pass through missing val…

Hi! Using lightGBM I faced another problem. I'm not sure if it is bug or feature :) but in our data we have a lot of empty values, so before we used sparse vector to store features, and it worked fine…

ekaterina-sereda-rf updated 6 years ago
5
apache/hudi #7839

[BUG] the deleted data reappeared after clustering on the ta…

**Environment Description** * Hudi version : 0.12.2 * Spark version : 3.2.2 * Hadoop version : 2.7.3 * Storage : hdfs **Describe the problem you faced** I have a hudi table and I delet…

MihawkZoro updated 1 year ago
9
facebookincubator/velox #5770

Incompatibility between Spark and Velox Data Types Causing R…

## Problem Description When utilizing Velox to read data from Spark, we've observed that certain data types are not represented identically between Spark and Parquet files. This discrepancy results…

srinivasst updated 1 year ago
3
microsoft/responsible-ai-toolbox #1652

Running Dashboards on Azure Databricks

**Describe the bug** I am trying to run ResponsibleAIDashboard on Azure Databricks Notebook. I have a cluster and I have installed raiwidgets library in the cluster. When I do this - ResponsibleAI…

achandak33 updated 2 years ago
1
facebookincubator/velox #1573

Adding validation logic in Substrait-to-Velox conversion

A Substrait plan validation can be used to determine whether to offload the computing to Velox. If the validation fails, the compute will fallback to Spark for execution. Instead of just trying the…

rui-mo updated 2 years ago
9
py4j/py4j #338

[IMPROVEMENT] Add something like py4j.redirect_stdout to rec…

Hi bartag and contrib, First off, thank you so much for making py4j! I use pyspark as part of my job and it's a lifesaver in terms of code reuse with the rest of our products and with building on t…

yingw787 updated 6 years ago
2
microsoft/hyperspace #263

[PROPOSAL]: Support variable schema for included columns

## Problem Description This design proposal is for adding feature request #229. Currently, Hyperspace supports creating indexes only on data with fixed schema. This means: - All columns from "…

pirz updated 3 years ago
7
Netflix/iceberg #21

Partition schema mangling for ORC

omalley updated 6 years ago
4
