-
Hello, can this distillation model be used for time series models? The dataset I want to process is related to weather prediction. Can it be used for that?
-
### Model/Pipeline/Scheduler description
ConsistencyTTA, introduced in the paper [_Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation_](https://arxiv.org/abs/2309.…
-
Currently `rf.BatchNorm` decides whether to update the running statistics based on the `rf.get_run_ctx().train_flag` as in [this line](https://github.com/rwth-i6/returnn/blob/master/returnn/frontend/n…
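To make the behaviour in question concrete, here is a minimal pure-Python sketch of a normalizer whose running statistics are updated only when a train flag is set. It is a toy illustration, not the RETURNN implementation: the class `RunningMeanNorm`, its `momentum` field, and the explicit `train_flag` argument are all hypothetical names chosen for this example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class RunningMeanNorm:
    """Toy 1-D mean normalizer (hypothetical, NOT RETURNN's rf.BatchNorm)."""

    momentum: float = 0.1
    running_mean: float = 0.0

    def __call__(self, batch: list[float], *, train_flag: bool) -> list[float]:
        if train_flag:
            # Training mode: normalize with the batch mean and
            # move the running mean towards it (exponential moving average).
            batch_mean = sum(batch) / len(batch)
            self.running_mean += self.momentum * (batch_mean - self.running_mean)
            mean = batch_mean
        else:
            # Eval mode: use the stored running mean and leave it untouched.
            mean = self.running_mean
        return [x - mean for x in batch]


norm = RunningMeanNorm()
train_out = norm([1.0, 3.0], train_flag=True)   # updates running_mean
eval_out = norm([1.0, 3.0], train_flag=False)   # reads running_mean only
```

In this sketch a single flag controls both which statistics are used and whether they are updated, so the two decisions cannot be made independently.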
-
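First of all, thanks to the InternVideo team for the excellent work.
@yinanhe
In the [readme](https://github.com/OpenGVLab/InternVideo/blob/main/InternVideo2/multi_modality/MODEL_ZOO.md) I can see download links for small models such as InternVideo2-CIP-S14/B14, but the models appear to be only a dozen or so MB in size, and it seems…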
-
### Feature request
This feature request proposes adding support for Meta's newly released Llama 3.2 models to lmdeploy. Llama 3.2 introduces exciting capabilities, including vision LLMs (11…
-
Hello,
Thank you for your comprehensive and wonderful survey.
Would you mind adding 2 papers about text summarization?
Paper 1: Enriching and Controlling Global Semantics for Text Summarizati…
-
An experiment for #231
da-en is one of our best models from the spring-2024 run. The teacher ensemble had a COMET score of 0.9013. The student COMET was 0.8950, with a tiny -0.0063 gap. In order to…
-
Hello! I am training the first two knowledge distillation stages of Mamba 2 on one DGX-H100x8 node, and I am experiencing train times of ~8 hours for the first stage, and ~13 hours for the second stag…
-
When you train LCM_svd, you set svd_solver like this:
svd_solver = SVDSolver(args.N, noise_scheduler.config.sigma_min, noise_scheduler.config.sigma_max, 7, 0.7, 1.6)
Why do you change training timestep t…
-
### Description
To be completed in the December release:
- [ ] Ultraviolet Advanced Oxidation Process @luohezhiming
- [ ] Dewatering Unit @adam-a-a
- [ ] Electrodialysis 0D & 1D @lbibl
- [ ]…