-
## Summary
MVS can be considered an improved version of the Gradient-based One-Side Sampling (GOSS, see details in the paper) implemented in LightGBM, which samples a given number of top exa…
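
For reference, here is a minimal sketch of the GOSS baseline this is compared against, not of MVS itself. It assumes the usual GOSS formulation: keep the `top_rate` fraction of examples with the largest absolute gradient, sample an `other_rate` fraction uniformly from the rest, and up-weight those by `(1 - top_rate) / other_rate`. The function name `goss_sample` is hypothetical, and this is not LightGBM's actual code.

```python
import random


def goss_sample(gradients, top_rate=0.2, other_rate=0.1, seed=0):
    """Return (indices, weights) for a GOSS-style subsample (illustrative only)."""
    rng = random.Random(seed)
    n = len(gradients)
    n_top = int(top_rate * n)
    n_other = int(other_rate * n)

    # Rank examples by absolute gradient, largest first.
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    top, rest = order[:n_top], order[n_top:]

    # Uniformly sample from the small-gradient remainder.
    sampled = rng.sample(rest, min(n_other, len(rest)))

    # Up-weight the sampled small-gradient examples so that their
    # contribution to gradient sums stays roughly unbiased.
    amplify = (1.0 - top_rate) / other_rate
    indices = top + sampled
    weights = [1.0] * len(top) + [amplify] * len(sampled)
    return indices, weights
```

The returned weights would multiply the gradients (and hessians) of the selected examples when computing split gains.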
-
I trained an inpainting model that uses `torch.fft.rfftn` / `torch.fft.irfftn` and accepts image data of shape `[b, 4, h, w]`. For some reason, `torch.onnx.export` can't export operators with compl…
-
**Environment:**
1. Framework: (TensorFlow, Keras, PyTorch, MXNet)
2. Framework version:
3. Horovod version:
4. MPI version:
5. CUDA version:
6. NCCL version:
7. Python version:
8. OS and vers…
-
We should re-implement the OX-SPLIT crossover operator, which was originally used by HGS. https://github.com/N-Wouda/Euro-NeurIPS-2022/commit/2c495c4438b6ae24c38bdbb39a5eff6f8bf6d2d9 is the relevant co…
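
As a starting point, here is a minimal sketch of the classic order crossover (OX) on giant tours. It covers only the OX half: the SPLIT step that cuts the resulting giant tour into routes is not shown, and nothing here is taken from the linked commit. The function name and the explicit `start` / `end` arguments are assumptions for illustration.

```python
def order_crossover(parent_a, parent_b, start, end):
    """Classic OX: copy parent_a[start..end], fill the rest in parent_b's order."""
    n = len(parent_a)
    child = [None] * n

    # Copy the segment from the first parent unchanged.
    child[start:end + 1] = parent_a[start:end + 1]
    used = set(parent_a[start:end + 1])

    # Walk the second parent circularly, starting right after the segment,
    # and place each client that is not in the child yet.
    pos = (end + 1) % n
    for offset in range(n):
        gene = parent_b[(end + 1 + offset) % n]
        if gene not in used:
            child[pos] = gene
            used.add(gene)
            pos = (pos + 1) % n
    return child


a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [3, 7, 5, 1, 6, 8, 2, 4]
print(order_crossover(a, b, start=2, end=4))  # [1, 6, 3, 4, 5, 8, 2, 7]
```

In practice `start` and `end` would be drawn at random for each offspring, and the child tour would then be passed to SPLIT.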
-
Typst seems like a good potential choice for people in the legal community who write court filings. However, some courts require that legal documents include line numbers in the left-hand co…
-
Hi,
Thanks for your awesome repo! We are a deep learning and optimization group, and we have two papers on multi-task learning.
- Direction-oriented Multi-objective Learning: Simple and Provable…
-
co-authored with @iffsid
Statistically, batching allows us to trade off the speed of taking gradient steps for lower variance gradient estimators. This can be done in a for loop. However, batching…
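
To make the variance trade-off concrete, here is a small sketch under a toy assumption: the per-example loss is `0.5 * (w - x)**2`, so the per-example gradient is `w - x`. The minibatch gradient is averaged in a plain for loop, and its variance is estimated empirically over repeated draws. All names here are hypothetical and not part of the library.

```python
import random
import statistics


def minibatch_gradient(data, w, batch_size, rng):
    """Average per-example gradients of 0.5 * (w - x)**2 over a random batch."""
    batch = rng.sample(data, batch_size)
    total = 0.0
    for x in batch:  # the plain for-loop version of batching
        total += w - x
    return total / batch_size


def estimator_variance(data, w, batch_size, trials=2000, seed=0):
    """Empirical variance of the minibatch gradient estimator."""
    rng = random.Random(seed)
    estimates = [minibatch_gradient(data, w, batch_size, rng) for _ in range(trials)]
    return statistics.pvariance(estimates)


data_rng = random.Random(42)
data = [data_rng.gauss(0.0, 1.0) for _ in range(1000)]
for batch_size in (1, 8, 32):
    print(batch_size, estimator_variance(data, w=0.0, batch_size=batch_size))
```

The printed variance should shrink roughly in proportion to `1 / batch_size`, while each estimate costs `batch_size` gradient evaluations.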
-
Currently, the library expects the dataloader to provide / the model to consume `(x, y)` pairs. This isn't appropriate for, e.g., autoregressive tasks like language modeling.
See, for example, Hu…
-
Something that can measure how well an LLM can deal with tools. CoT already kind of goes that way, but not really, since it's limited to a bit of mathematical reasoning rather than actual tool usage.