-
_The problem:_ Not all residual spectrum outliers originate solely from line strength mismatches. In general, line _width_ mismatches or line center shifts will be present in the residual spectrum as…
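To make the point concrete (an illustrative numpy sketch, not from the original issue): for a Gaussian line, a small center shift leaves an antisymmetric, derivative-shaped residual, while a width mismatch leaves a symmetric one, so both produce structured outliers even when the line strength is matched perfectly.

```python
import numpy as np

def gaussian(x, mu=0.0, sigma=1.0):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2)

x = np.linspace(-5, 5, 201)
data = gaussian(x)                            # "true" line
resid_shift = data - gaussian(x, mu=0.1)      # center shift mismatch
resid_width = data - gaussian(x, sigma=1.1)   # width mismatch

# To first order, a center shift delta leaves a residual ~ delta * g'(x),
# an antisymmetric "S" shape; a width mismatch leaves an even residual.
print(np.allclose(resid_shift, -0.1 * x * data, atol=0.01))  # True: derivative shape
print(np.allclose(resid_width, resid_width[::-1]))           # True: exactly symmetric
```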
-
As mentioned in https://github.com/WoosukKwon/cacheflow/pull/81#issuecomment-1546980281, the current PyTorch-based top-k and top-p implementation is memory-inefficient. This can be improved by introdu…
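For context, here is roughly what the memory-hungry PyTorch path looks like (a minimal sketch with my own names, not CacheFlow's actual code): it materializes a full sort, softmax, and cumulative sum over the entire vocabulary for every sequence in the batch, which is the inefficiency being discussed.

```python
import torch

def top_k_top_p_filter(logits: torch.Tensor, k: int = 50, p: float = 0.9) -> torch.Tensor:
    """Mask logits outside top-k and the top-p nucleus; logits: (batch, vocab)."""
    # Top-k: drop everything below the k-th largest logit.
    kth = torch.topk(logits, k, dim=-1).values[..., -1:]
    logits = logits.masked_fill(logits < kth, float("-inf"))

    # Top-p: full sort + softmax + cumsum over the vocab -- the memory-heavy part.
    sorted_logits, sorted_idx = torch.sort(logits, descending=True, dim=-1)
    probs = torch.softmax(sorted_logits, dim=-1)
    # Drop a token once the mass *before* it already exceeds p, so the first
    # token that crosses the threshold is still kept.
    drop = probs.cumsum(dim=-1) - probs > p
    sorted_logits = sorted_logits.masked_fill(drop, float("-inf"))
    # Unsort back to the original vocab order.
    return torch.full_like(logits, float("-inf")).scatter(-1, sorted_idx, sorted_logits)

logits = torch.randn(2, 32000)
next_tok = torch.multinomial(torch.softmax(top_k_top_p_filter(logits), -1), 1)
```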
-
GPUs are hungry pieces of hardware and want a steady supply of commands. Many practical algorithms involve many iterations, where each iteration launches one or more kernels that are by themselves no…
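One common mitigation for this launch-bound pattern (not necessarily what this post goes on to propose, since it is truncated) is to capture the per-iteration kernel sequence into a CUDA graph and replay it, paying the launch overhead once per iteration instead of once per kernel. A minimal PyTorch sketch of the capture/replay pattern:

```python
import torch

x = torch.randn(1 << 10, device="cuda")

# Warm-up on a side stream is required before graph capture.
s = torch.cuda.Stream()
s.wait_stream(torch.cuda.current_stream())
with torch.cuda.stream(s):
    for _ in range(3):
        y = (x * 2).sin().cos()
torch.cuda.current_stream().wait_stream(s)

# Capture the iteration body: several small kernels become one graph.
g = torch.cuda.CUDAGraph()
with torch.cuda.graph(g):
    y = (x * 2).sin().cos()

for _ in range(1000):
    g.replay()  # a single launch replays the whole sequence; the result lands in y
```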
-
I have rebuilt `ginkgo` from the latest commit in master, same results as from the last release:
```
---> Testing ginkgo
Executing: cd "/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_math…
```
-
## Integrating DeepSpeed with PyTorch Lightning
Integrating DeepSpeed with PyTorch Lightning can significantly enhance training efficiency and scalability, especially for large models and distribut…
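As a concrete starting point (a minimal sketch; the tiny module below is a placeholder, and any existing LightningModule works unchanged), the integration mostly comes down to the Trainer's `strategy` argument:

```python
import torch
import pytorch_lightning as pl

class TinyModel(pl.LightningModule):
    """Hypothetical stand-in model; nothing DeepSpeed-specific in here."""
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(512, 512)

    def training_step(self, batch, batch_idx):
        return self.layer(batch).pow(2).mean()

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters())

trainer = pl.Trainer(
    accelerator="gpu",
    devices=4,
    strategy="deepspeed_stage_2",  # ZeRO stage 2: shards optimizer state + gradients
    precision="16-mixed",
)
# trainer.fit(TinyModel(), train_dataloaders=...)  # supply your DataLoader
```

Heavier variants ("deepspeed_stage_3", the offload strategies) trade extra communication for lower per-GPU memory.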
-
**Describe the bug**
In https://github.com/pixie-io/pixie/pull/1795, we introduced the ability for kernels that support the 1M BPF program limit to raise certain tunables used to restrict program siz…
-
For distributing large computations, but also for applications such as position-dependent PSF kernels, it would be really useful to have a split-and-merge scheme for `Map` objects. There are many things to be c…
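As a rough shape for the idea (a generic numpy sketch with hypothetical names, not a proposed gammapy API): split the map data into tiles, process each tile independently (e.g. with its own PSF kernel), then stitch them back.

```python
import numpy as np

def split_map(data: np.ndarray, tiles: int = 2):
    """Split a 2D array into tiles x tiles blocks (axes must divide evenly here)."""
    rows = np.split(data, tiles, axis=0)
    return [np.split(r, tiles, axis=1) for r in rows]

def merge_map(blocks) -> np.ndarray:
    """Inverse of split_map: stitch the blocks back together."""
    return np.block(blocks)

data = np.arange(64.0).reshape(8, 8)
blocks = split_map(data, tiles=2)
# ... apply a position-dependent PSF per block here ...
assert np.array_equal(merge_map(blocks), data)
```

A real scheme would also have to carry WCS/geometry information and handle overlap regions for kernels that straddle tile boundaries.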
-
From Etienne @EtienneBachmann:
Another idea to improve adjoint run speed is to merge the GPU kernels in the compute_kernels routine, where rho kernels and other kernels are separated. It should n…
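To illustrate the kind of merge being suggested (an illustrative CuPy sketch with made-up arithmetic, not SPECFEM's actual kernels): a single fused kernel computes both contributions per thread, so the shared fields are loaded from global memory once and only one launch is needed instead of two.

```python
import cupy as cp

fused = cp.RawKernel(r'''
extern "C" __global__
void fused_kernels(const float* accel, const float* displ,
                   float* rho_kl, float* kappa_kl, int n) {
    int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i < n) {
        float a = accel[i];   // shared loads happen once for both updates
        float d = displ[i];
        rho_kl[i]   += a * d;          // was a separate "rho" kernel
        kappa_kl[i] += a * a + d * d;  // was a second kernel (placeholder math)
    }
}
''', 'fused_kernels')

n = 1 << 20
accel = cp.random.rand(n, dtype=cp.float32)
displ = cp.random.rand(n, dtype=cp.float32)
rho_kl = cp.zeros(n, dtype=cp.float32)
kappa_kl = cp.zeros(n, dtype=cp.float32)
fused(((n + 255) // 256,), (256,), (accel, displ, rho_kl, kappa_kl, cp.int32(n)))
```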
-
### Feature description
I find myself limited by nebari with respect to working on larger analyses (multiple notebooks spread across a directory tree). Locally, I would either:
- start JupyterLab with…
-
https://www.microsoft.com/en-us/research/blog/deepspeed-accelerating-large-scale-model-inference-and-training-via-system-optimizations-and-compression/
> High-performance INT8 inference kernels are …
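For a sense of what those kernels implement, the core idea is symmetric quantization: store values as int8 plus a float scale, run the heavy math in integer arithmetic, and dequantize at the end. A minimal numpy sketch of the round trip (illustrative only, not DeepSpeed's implementation):

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor quantization: int8 values plus one float scale."""
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(256, 256).astype(np.float32)
q, s = quantize_int8(w)
print(f"max abs error: {np.abs(dequantize(q, s) - w).max():.4f}")  # bounded by ~scale/2
```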