-
**Describe the bug**
[2024-01-21T17:44:19.558Z] FAILED ../../src/main/python/hash_aggregate_test.py::test_exact_percentile_reduction[[('val', RepeatSeq(Double)), ('freq', Long(not_null))]][DATAGEN_…
-
### Description
I followed the tutorials in https://cloud.google.com/tpu/docs/tutorials/transformer to build a translation model. The Memory Error occurs when I use t2t-datagen with changing problem …
-
### Description
I created a multiprocess_generate supporting problem, using ChoppedTextProblem as a template.
When I actually ran the generation, it eventually froze. I used [Pyrasite](http://py…
-
Hello @varshavshetty.
Now that we have the KN1.0 model, we need to look at the areas where we can improve it. You can check the ideas below to do so:
1. **Increase Model Compl…
-
### Issue description
I cannot set the Injection Rate of the Fusion Reactor with the OpenComputer Adapter.
I tried multiple things like reinstalling the modpack (1.12.2 Pack), creating a new compu…
-
Hello,
I'm trying to generate data for streaming benchmarks with genSeedDataset.sh, but it stops because of a number of java.io.FileNotFoundException file:/tmp/hadoop-aldinuc/mapred/local/14873397966…
-
The idea behind this is to have some sort of class that `extends Text` or at least `implements Supplier`. Every instance of this class would automatically register itself for datagen, though it would …
-
Repro:
```
SPARK_RAPIDS_TEST_DATAGEN_SEED=3 ./integration_tests/run_pyspark_from_build.sh -k 'test_decimal_round'
```
(part of) results:
```
Row(round(a, 0)=7.759831900331442e+18, round(1.…
-
From a Databricks [premerge build](https://prod.blsm.nvidia.com/sw-gpu-spark-jenkins/blue/organizations/jenkins/rapids-databricks_premerge-github/detail/rapids-databricks_premerge-github/445/pipeline/…
-
### Is your feature request related to a problem? Please describe.
Simulate data skewness
### Describe the solution you'd like
Use something off-the-shelf
Power law distribution.
Also sin…
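
For illustration, here is a minimal sketch of how power-law key skew could be generated with nothing but the Python standard library. The function name `skewed_keys` and its parameters (`num_rows`, `num_keys`, `exponent`, `seed`) are hypothetical and not part of any existing datagen tool; the sketch only assumes that "skew" means a few keys should account for most rows.

```python
import random
from collections import Counter


def skewed_keys(num_rows, num_keys, exponent=1.5, seed=0):
    """Draw num_rows keys from range(num_keys) with power-law skew.

    Key k is drawn with probability proportional to 1 / (k + 1) ** exponent,
    so key 0 is the hottest key. All names here are illustrative assumptions.
    """
    rng = random.Random(seed)  # fixed seed keeps runs reproducible
    weights = [1.0 / (rank ** exponent) for rank in range(1, num_keys + 1)]
    return rng.choices(range(num_keys), weights=weights, k=num_rows)


keys = skewed_keys(num_rows=10_000, num_keys=100)
counts = Counter(keys)
# Inspect the heaviest keys to confirm the skew.
print(counts.most_common(5))
```

A larger `exponent` concentrates more rows on the first few keys, while `exponent=0` degenerates to a uniform distribution, which may be a convenient way to dial skew up or down in a benchmark.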