-
Hi, it seems that the same code **works fine with the Megatron-LM that I git-cloned in April. With the latest Megatron-LM, the following error is raised by the pretrain_gpt.py code. …
-
### Metadata
Authors: Marek Rei and Anders Søgaard
Organization: University of Cambridge & University of Copenhagen
Conference: NAACL 2018
Paper: https://arxiv.org/pdf/1805.02214.pdf
Code: https:…
-
> … [programmers might finally have the decency to pay attention to the document formats that the other 99% of the human race prefers](https://github.com/swcarpentry/modern-scientific-authoring/blob/4…
wking updated 8 years ago
-
I am running the pretraining code the way you suggested, but it has been stuck at this point for 2 hours now. Is it supposed to take this long?
```console
neilpaul77@NeilRig77:~/Downloads/ntua-slp-…
```
-
@alasdairtran Hi, I have read your newly published paper. I'm curious how LSTM+GloVe+IA encodes articles: does it encode each article at the sentence level or at the word level?
-
Hi,
I am a researcher studying EEG-to-Text. I recently read your NeuSpeech paper. I was impressed by it, and it has been a great help to my research direction. Thanks. But I have some quest…
-
I am using the `intfloat/e5-mistral-7b-instruct` model to get the last hidden state for my input and compute cosine similarity.
I am using the toy example provided at: https://huggingface.co/intfloat/e5-mist…
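For reference, the similarity step itself is independent of the model. Here is a minimal sketch of cosine similarity between two embedding vectors, with NumPy arrays standing in for pooled hidden states (the vectors `query_emb` and `doc_emb` below are hypothetical placeholders, not real model output):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two 1-D embedding vectors."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    # Dot product of the vectors divided by the product of their norms.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical stand-ins for pooled last-hidden-state embeddings.
query_emb = np.array([0.1, 0.3, -0.2, 0.7])
doc_emb = np.array([0.2, 0.1, -0.1, 0.9])
print(cosine_similarity(query_emb, doc_emb))
```

Note that if the embeddings are L2-normalized first (as many retrieval recipes do), cosine similarity reduces to a plain dot product.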
-
This is maybe a trivial question, but I'm completely new to Torch; I tried searching on Google but had no luck. I'm working on an Ubuntu 14.04 machine with CUDA 7.0 and cuDNN R4. I prepared all traini…
-
Thanks for the great code. I encountered an issue when using GroundingDINO (or maybe it is just expected behavior?).
If I use a long word, like 'pottedplant', it is tokenized into several sub-words.
wh…
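To illustrate the splitting behavior without loading the real tokenizer, here is a sketch of greedy longest-match-first (WordPiece-style) subword splitting over a toy vocabulary. The vocabulary below is invented for illustration; the actual BERT vocabulary used by GroundingDINO may split 'pottedplant' differently:

```python
def wordpiece_split(word, vocab):
    """Greedy longest-match-first subword splitting (WordPiece-style)."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        # Try the longest remaining substring first, shrinking until a match.
        while end > start:
            sub = word[start:end]
            if start > 0:
                sub = "##" + sub  # continuation pieces carry a '##' prefix
            if sub in vocab:
                piece = sub
                break
            end -= 1
        if piece is None:
            return ["[UNK]"]  # no piece matches: the whole word is unknown
        pieces.append(piece)
        start = end
    return pieces

# Toy vocabulary, invented for this example.
toy_vocab = {"potted", "plant", "##plant", "pot", "##ted"}
print(wordpiece_split("pottedplant", toy_vocab))  # ['potted', '##plant']
```

This is why a single text prompt token like 'pottedplant' can map to multiple token positions, which matters when aligning per-token logits back to phrases.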
-
Let's write down some of our takeaways from "Attention Is All You Need", and then one of us can collate them into a single document to put into this repo, so that we can remind ourselves when we forget. …
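One takeaway worth recording up front is the paper's core operation. A minimal sketch of scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ/√d_k)V, with random matrices standing in for projected queries, keys, and values:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # (n_q, n_k) similarity logits
    scores -= scores.max(axis=-1, keepdims=True)  # subtract max for stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                             # weighted sum of values

# Stand-in projections; shapes chosen for illustration only.
rng = np.random.default_rng(0)
Q = rng.standard_normal((2, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (2, 4)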