-
I think a future optimization to be considered is removing the requirement for 65 zero bytes to be passed in for the signature of a 1 input transaction. It would add a bit more logic to the rootchain,…
-
Hi, I generated a BLIF file using Yosys for a full adder. When I try to use LSOracle for optimization, it doesn't seem to give an optimized result. For example, I expect the cout to be just one level - major…
-
```
The current implementation of u-s has several limitations in providing
features to control how the JIT should work.
1. A more flexible hotness threshold value. The JIT compiler will be
executed …
-
### 🚀 The feature, motivation and pitch
We followed stock CUDA for grid and block configurations. For these configurations, stock CUDA makes some NVIDIA GPU architecture assumptions. Even we followed similar co…
-
**Describe the bug**
I evaluated `OPT-66B` with ZeRO-3 and set offloading to `nvme`, which works fine, but I also increased
`max_in_cpu` to 100G
printed as
```
DeepSpeedZeroOffloadParamConfig(dev…
-
It may be worthwhile for `smbus_read_i2c_block_data()` and similar functions to write data into a preallocated buffer (of fixed size `&mut [u8]`, or allowing the function to resize a `&mut [u8]`), rat…
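
To make the suggestion concrete, here is a minimal sketch contrasting an allocating read with one that fills a caller-provided `&mut [u8]`. Everything in it is an assumption for illustration: `MockDevice`, `read_block_alloc`, and `read_block_into` are hypothetical names, not the crate's actual API, and the mock copies from memory instead of talking to the kernel. The only external fact used is that an SMBus block transfer carries at most 32 data bytes.

```rust
/// Maximum number of data bytes in one SMBus block transfer.
const I2C_SMBUS_BLOCK_MAX: usize = 32;

/// Hypothetical stand-in for an I2C device; a real implementation
/// would issue an ioctl rather than copy from memory.
struct MockDevice {
    register_data: Vec<u8>,
}

impl MockDevice {
    /// Allocating style: every call returns a freshly allocated Vec.
    fn read_block_alloc(&self, len: u8) -> Vec<u8> {
        let n = (len as usize)
            .min(I2C_SMBUS_BLOCK_MAX)
            .min(self.register_data.len());
        self.register_data[..n].to_vec()
    }

    /// Buffer style: writes into storage owned by the caller and
    /// returns how many bytes were written. No allocation happens here.
    fn read_block_into(&self, buf: &mut [u8]) -> usize {
        let n = buf
            .len()
            .min(I2C_SMBUS_BLOCK_MAX)
            .min(self.register_data.len());
        buf[..n].copy_from_slice(&self.register_data[..n]);
        n
    }
}

fn main() {
    let dev = MockDevice {
        register_data: vec![0xDE, 0xAD, 0xBE, 0xEF],
    };

    // One allocation per call.
    let owned = dev.read_block_alloc(4);
    println!("alloc: {:02x?}", owned);

    // One stack buffer reused across repeated reads.
    let mut buf = [0u8; I2C_SMBUS_BLOCK_MAX];
    for _ in 0..3 {
        let n = dev.read_block_into(&mut buf);
        println!("into:  {:02x?}", &buf[..n]);
    }
}
```

Returning the byte count lets the caller slice the valid prefix of a fixed-size buffer, which is one way to avoid having the function resize anything.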
-
**What happened**:
I'm trying to take the result from a `map_blocks` function and store one slice of the resulting array in one zarr array and another slice in another array. I set `compute=Fal…
-
**Describe the bug**
When compiling HDF5 with NVHPC versions 23.5 - 23.9 (additional versions may also be applicable) and with `-O1` (or higher) and `-DNDEBUG`, testing failures occur in the followin…
-
## Open Source Contributors Welcomed!
Please comment below if you would like to work on this issue!
### Contact Details [Optional]
support@zenml.io
## What happened?
Currently, the Vertex o…
-
I noticed that "Tutel v0.3: Add Megablocks solution to improve decoder inference on single-GPU with num_local_expert >= 2", but when I use megablocks in MoE training (dropless-MoE), the following err…