-
**Describe the bug**
1. The code blocks in `pipeline_model.fit()`: no progress is made and the Spark stage count stays at 0 (a sketch of the call is shown below).
2. CSV data: 200 columns, 800,000 rows.
3. Training the same data on a single CentOS machine takes about 1 minute.
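For context, here is a minimal Scala sketch of the kind of pipeline fit being described; the file path, schema, and pipeline stages are hypothetical and not taken from the report:
```
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("fit-repro").getOrCreate()

// Hypothetical wide CSV: ~200 columns, ~800,000 rows, with a "label" column.
val df = spark.read
  .option("header", "true")
  .option("inferSchema", "true")
  .csv("/path/to/data.csv")

val featureCols = df.columns.filter(_ != "label")
val assembler = new VectorAssembler().setInputCols(featureCols).setOutputCol("features")
val lr = new LogisticRegression().setLabelCol("label").setFeaturesCol("features")
val pipeline = new Pipeline().setStages(Array(assembler, lr))

// The report says this call makes no progress and the Spark stage counter stays at 0.
val model = pipeline.fit(df)
```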
**To Reproduce**…
-
- [x] Add native parquet writer #2004
- [x] Parquet files written by presto-parquet can't be read by parquet-hadoop library used in Spark #6377
- [x] Native Parquet Writer writes Parquet V2 files th…
-
I'm seeing JVM crashes in our Spark cluster which I believe are caused by `LGBM_DatasetCreateFromCSRSpark`.
https://github.com/microsoft/LightGBM/issues/2360 indicated some issues in that met…
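For context, that native entry point is (as I understand it) reached when training on sparse feature vectors. Below is a rough, hedged Scala sketch of that path; the package name follows recent SynapseML releases (older releases used `com.microsoft.ml.spark`), and the toy dataset is hypothetical:
```
import com.microsoft.azure.synapse.ml.lightgbm.LightGBMClassifier
import org.apache.spark.ml.linalg.Vectors

// Sparse vectors are what go through the CSR (compressed sparse row) native path
// when the LightGBM dataset is built on the workers.
val train = spark.createDataFrame(Seq(
  (0.0, Vectors.sparse(1000, Array(3, 17), Array(1.0, 2.0))),
  (1.0, Vectors.sparse(1000, Array(5, 99), Array(4.0, 1.0)))
)).toDF("label", "features")

val lgbm = new LightGBMClassifier()
  .setLabelCol("label")
  .setFeaturesCol("features")

val model = lgbm.fit(train)
```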
-
## Problem Description
This design proposal is for adding feature request #229.
Currently, Hyperspace supports creating indexes only on data with a fixed schema (a sketch of the current API follows the list below). This means:
- All columns from "…
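For reference, index creation with today's fixed-schema API looks roughly like the following; the DataFrame path and column names are hypothetical:
```
import com.microsoft.hyperspace._
import com.microsoft.hyperspace.index._

val hs = new Hyperspace(spark)

// Hypothetical source data in which every record has the same columns.
val df = spark.read.parquet("/data/employees")

// Today, the indexed and included columns must be present in the fixed schema.
hs.createIndex(df, IndexConfig("deptIndex", indexedColumns = Seq("deptId"), includedColumns = Seq("name")))
```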
-
## Willingness to contribute
The MLflow Community encourages new feature contributions. Would you or another member of your organization be willing to contribute an implementation of this feature (ei…
-
Hi bartag and contributors,
First off, thank you so much for making py4j! I use pyspark as part of my job and it's a lifesaver in terms of code reuse with the rest of our products and with building on t…
-
### SynapseML version
0.10.0
### System information
- **Language version** (e.g. python 3.8, scala 2.12): python3
- **Spark Version** (e.g. 3.2.2): 3.0
- **Spark Platform** (e.g. Synapse, Databri…
-
I wish we could support the data type org.apache.spark.mllib.linalg.VectorUDT.
Mini repro:
```
import org.apache.spark.sql.types._
import org.apache.spark.sql.Row

val rows = spark.sparkContext.parallelize(
  List(
    Row(0.0, 1…
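
// (The rest of the repro is cut off above. As a hedged sketch of the usual
// workaround today, legacy org.apache.spark.mllib.linalg vector columns can be
// converted to the supported org.apache.spark.ml.linalg type before the
// DataFrame is handed to the library; the DataFrame `df` below is hypothetical.)
import org.apache.spark.mllib.util.MLUtils

val converted = MLUtils.convertVectorColumnsToML(df)
// `converted` now carries org.apache.spark.ml.linalg.VectorUDT columns, which
// Spark ML-based APIs accept.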
-
_Original author: tony.hi...@gmail.com (August 02, 2012 18:46:43)_
Removing rows that are duplicates in one or more columns currently requires a clunky workaround.
My intuition when I went looking for a dedupe opt…
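As an illustration of the behaviour being asked for (not this project's API), Spark's DataFrame interface exposes dedupe-on-selected-columns as a one-liner; the DataFrame and column names here are hypothetical:
```
// Keep one row per distinct (email, name) pair; the remaining columns come from
// an arbitrary surviving row.
val deduped = df.dropDuplicates(Seq("email", "name"))
```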