-
**Microsoft Documentation is Inconsistent with GitHub Regarding Release Status**
The Microsoft documentation describes this project as if it were ready for production use.
https://docs.microsof…
-
Hi, I am new to running XGBoost in a distributed setup. I am training an XGBoost model on around 200,000 rows and 209,538 columns. The data is stored in Hadoop HDFS on a 16-node cluster.
My code…
-
Re. https://github.com/snorkel-team/snorkel-tutorials/blob/master/drybell/drybell_spark.py
```
# Generate training labels
logging.info("Generating probabilistic labels")
y_prob = label_model.p…
```
-
Hi,
first of all, I am sorry if this is already known or resolved. I searched a bit before posting and haven't found any related posts.
Lately I have been encountering an Exception trying …
-
I'm getting this error when trying to copy a dataframe to Spark and don't know how to fix it.
```
|================================================================================| 100% 352 MB
Err…
```
-
I am currently working on a project involving content extraction from a certain number of WARC archives.
I use the Archives Unleashed Toolkit to extract plain HTML content (script below) and it works ver…
-
This ticket is to hold information for the homepage of the new docs site, i.e. what is currently at .
![Screen Shot 2020-06-01 at 2 08 07 PM](https://user-images.githubusercontent.com/3834704/83439…
-
I'm successfully using COS storage to save Parquet files via Spark DataFrames, similar to the following:
- **newWeatherDF.write.format("csv").parquet("cos://..../ingested//data.parquet")**
How…
-
Hi there! I'm trying to set up the Hive metastore with Minio as a storage backend so that I can write Spark DataFrames to Minio as Hive tables. I'm using ```data.write.saveAsTable("tableName"…