-
> oneflow version: 0.8.0+cu102; libai version: latest commit
I am trying to run GPT-2 with oneflow-libai. Following the [tutorial](https://libai.readthedocs.io/en/latest/tutorials/get_started/quick_run.html), I changed only the dataset-related configuration and ran `bash to…
-
https://arxiv.org/pdf/2110.08450.pdf
-
Hi, I am Gordon Lee.
Sorry to bother you with this issue.
Thanks for your excellent work on semantic-retrieval models.
Recently, MLNLP and I have made a search tool to collect top-tier conference up…
-
Hi there,
We're an [MLSys group](https://guanh01.github.io/) working on multi-task learning at the University of Massachusetts Amherst. We recently completed some new work. Would you mind taki…
-
Before open-sourcing the nnsmith project, I want to simplify and standardize development and user accessibility a bit. Below is a tracking list of TODOs for @ganler to make the repository a better…
-
Technically speaking, `.view()` should not actually trigger memory reformatting (at least on GPUs; the XLA case is less clear because it brings in extra complexity); therefore, the tensors that contain t…
-
Are there any publications/docs explaining the motivation behind
using `jensen_shannon_divergence` / `infinity_norm` for the Data Drift / Training-Serving Skew detection?
Since there are many approa…
-
https://proceedings.mlsys.org/paper/2021/file/757b505cfd34c64c85ca5b5690ee5293-Paper.pdf
Feels like this should have been submitted to ICLR.
-
First, many thanks for sharing your brilliant work! After reading the R2B paper, I am a little confused about the data-driven rescaling.
For BNN, one of the most significant benefits is to use on…
-
PDF: [https://research.fb.com/wp-content/uploads/2021/04/CPR-Understanding-and-Improving-Failure-Tolerant-Training-for-Deep-Learning-Recommendation-with-Partial-Recovery.pdf](https://research.fb.com/w…