-
We need to enhance `next_access` and `previous_access` to cover what various NG-Arch issues require. We'll use a dataflow analysis approach to do this.
-
Many heterogeneous computing tasks are variations of the same well-known patterns that differ only in the functions called per element.
Like, reduce, for example, when SoftMax and RMS normalization, in…
-
**Describe the bug**
I tried running DeepSpeed ZeRO-3 on a new Hugging Face model and got the following error:
[2023-12-13 04:12:18,837] [WARNING] [parameter_offload.py:86:_apply_to_tenso…
-
# Motivation
To run a deep learning model, the mainstream approach in TVM is to do code generation with an auto-tuner, which makes the performance optimization process easier for developers without do…
-
Does lucid now support visualizing my own TensorFlow models? And how do I make my own channel spritemaps?
Thanks!
-
It seems to me that rayon's awareness of iterators, types and mutability ought to make automatic GPU offloading a feasible project.
An idea would be to use compiler plugins/proc-macros that precom…
-
Dear Mark,
I am trying to commission my FFF linac using your repository. After generating the machine data, in the validation step, the PDDs match the measured ones well for all fields. However…
-
[Grouped Query Attention](https://arxiv.org/abs/2305.13245) improves parameter-efficiency of attention KV projections and reduces IO at inference-time, making inference faster.
It can be implemente…
-
@naibaf7 @bhack
I'm opening this ticket to discuss ideas for integrating libdnn into tiny-cnn.
So far, I have implemented a small interface to get the native OpenCL context from tiny-cnn:
https://github…
-
Please make sure that this is a bug. As per our
[GitHub Policy](https://github.com/tensorflow/tensorflow/blob/master/ISSUES.md),
we only address code/doc bugs, performance issues, feature requests a…