-
Looking into standard preprocessing pipelines, it would be good to add common average re-referencing.
See the background article here: https://eeglab.org/tutorials/ConceptsGuide/rereferencing_background.html
…
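For context, common average re-referencing is just a per-sample mean subtraction across channels; here is a minimal, library-agnostic NumPy sketch (the array shape is an assumption):

```python
import numpy as np

# data: EEG array of shape (n_channels, n_samples); the shape is an assumption.
data = np.random.randn(32, 1000)

# Common average reference: subtract the mean across channels at every sample.
car_data = data - data.mean(axis=0, keepdims=True)
```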
-
**Description**
I would like to shard one large LLM across multiple GPUs, but Triton wants to load a separate copy of the model onto each GPU, which results in an OOM.
**Triton Information**
Wh…
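For reference, this is not Triton's own API, but a minimal sketch of the desired behaviour (one sharded copy instead of one copy per GPU) using Hugging Face transformers with accelerate-style device mapping; the model id is a hypothetical placeholder:

```python
from transformers import AutoModelForCausalLM

# One sharded copy across all visible GPUs (requires `accelerate` installed);
# the model id is hypothetical.
model = AutoModelForCausalLM.from_pretrained(
    "my-org/my-large-llm",
    device_map="auto",   # place different layers on different GPUs
    torch_dtype="auto",
)
```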
-
I added an STN (spatial transformer network) on top of LPRNet, but this new structure seems very hard to train: the loss does not decrease. Does anyone know how to deal with this?
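Not specific to this repo, but a common fix when an STN-augmented network's loss refuses to drop is to initialize the STN's localization head to the identity transform, as in the standard PyTorch STN tutorial, so the STN initially passes inputs through unchanged; the input width (32) below is a placeholder:

```python
import torch
import torch.nn as nn

# Final layer of the STN's localization network; 32 input features is a placeholder.
loc_head = nn.Linear(32, 6)  # 6 parameters of the 2x3 affine matrix

# Initialize to the identity transform so the STN is a no-op at the start of
# training and the rest of LPRNet can learn normally.
loc_head.weight.data.zero_()
loc_head.bias.data.copy_(torch.tensor([1, 0, 0, 0, 1, 0], dtype=torch.float))
```

Training the STN branch with a lower learning rate than the backbone is another commonly reported remedy.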
-
## ❓ Questions and Help
Hi all,
I am using the pre-trained transformer for my project. I followed ['reproduce ende-wmt14'](https://github.com/facebookresearch/fairseq/issues/346) to train the tran…
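For reference, a pre-trained En-De transformer can also be loaded through torch.hub as shown in fairseq's translation examples; the hub id below is the one those examples use for the scaling-NMT En-De model, but verify it against the current fairseq docs:

```python
import torch

# Load a pre-trained En-De transformer via torch.hub (hub id per fairseq's
# translation examples; confirm against the current fairseq README).
en2de = torch.hub.load('pytorch/fairseq', 'transformer.wmt16.en-de',
                       tokenizer='moses', bpe='subword_nmt')
print(en2de.translate('Hello world!'))
```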
-
Hi, we're using the litgpt framework to train models and would then like to export them to Hugging Face format for continued tuning and evaluation.
The steps we're using after completing training ar…
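The step list above is cut off; for reference, here is a sketch of the final loading step, assuming the litgpt checkpoint has already been converted to a Hugging Face compatible state dict (litgpt ships a conversion utility, but the exact command depends on the version; the paths and base model id here are hypothetical placeholders):

```python
import torch
from transformers import AutoModelForCausalLM

# Load converted weights into a transformers model for further tuning/eval.
# The path and base model id are hypothetical placeholders.
state_dict = torch.load("out/converted/model.pth")
model = AutoModelForCausalLM.from_pretrained(
    "my-org/base-model",
    state_dict=state_dict,
)
```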
-
Hi,
from reading the UDOP paper, my understanding is that during pre-training the model is taught to predict the layout of a target (textual) sequence using special layout tokens.
I was wondering …
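For concreteness, here is a sketch of that kind of discretization: each normalized bounding-box coordinate is mapped to one of a fixed number of bins and rendered as a special token. The token format and the bin count of 500 are assumptions for illustration, not necessarily UDOP's exact vocabulary:

```python
def bbox_to_layout_tokens(bbox, num_bins=500):
    """Map an (x0, y0, x1, y1) box with coordinates in [0, 1] to layout tokens.

    The token format and bin count are assumptions for illustration.
    """
    return [f"<loc_{min(int(c * num_bins), num_bins - 1)}>" for c in bbox]

print(bbox_to_layout_tokens((0.12, 0.30, 0.48, 0.36)))
# ['<loc_60>', '<loc_150>', '<loc_240>', '<loc_180>']
```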
-
**Describe the bug**
Installation error on the first step of the tutorial when using Studio Lab (https://github.com/aws/studio-lab-examples/blob/main/connect-to-aws/Access_AWS_from_Studio_Lab_Deploy…
-
Hello, first of all, thank you for open-sourcing this training code, which is very important for many of us developers.
At present, I want to use my small personal dataset for fine-tuning under the origin…
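The sentence above is cut off, so the base model and framework are unknown; as one common route for fine-tuning on a small personal dataset, here is a minimal LoRA sketch using the peft library, assuming a Hugging Face style causal LM (the model id and target module names are placeholders, and LoRA is a suggested technique, not necessarily this repo's method):

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Model id and target module names are hypothetical placeholders.
model = AutoModelForCausalLM.from_pretrained("my-org/base-model")
lora_cfg = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                      target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # only the adapter weights are trainable
```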
-
Hi, dear author:
The llava-next project seems to be really insightful exploratory work. Please kindly release the training and inference code as soon as possible; thank you very much.
-
Argilla integration, dataset integration, etc.
Details to follow.