-
For a model with adjacent weight tying, as in section 2.2.1, the gradient goes to NaN after a while.
The model is designed to work on bAbI (1k dataset). I tried lowering the learning rate to 1e-5 fr…
-
Hi Nicolas,
First, thank you very much for your work. When I run your code, I cannot get meaningful results; all I get is output like this:
```
NFO:lib.nn_model.train:[why ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as…
-
### Description
I am trying to run a MAUI app on my M1 MacBook Pro. I don't have Rosetta 2 installed and when running `dotnet new maui && dotnet build -t:Run -f net6.0-maccatalyst` I get the follow…
-
**[The following two posts are my reply to /u/starspawn0's comment on the paper [*Language Modeling for Formal Mathematics*](https://arxiv.org/abs/2006.04757) by Christian Szegedy et al., posted on su…
-
Hello,
I am trying to understand how you preprocessed the CodRep datasets for Sequencer, and there are several things I don't understand:
- I've cloned the CodRep competition repository and I don't have the sam…
-
I want to store the answers from multiple models on multiple tasks to do Knowledge Distillation, but I'm broke and I can't afford to run them. I was thinking of using data from this project.
Can we a…
-
Hi,
We talked about your input pipeline on Stack Overflow, and since you advised me to open an issue here, here I am. You've helped me a lot already but I would like to know more about the actual …
-
Do PaLM models really ‘understand’ language? Especially the logical reasoning aspect?
Folks at [Facebook AI Research](https://engineering.fb.com/category/ai-research/) came up with a set of see…