-
Since the release of TensorFlow 2.0, several enhancements have been made to both versions. Some functions have been removed from 1.x, and some are deprecated and replaced in tensorflow 2…
-
## Overview
As part of the information architecture refactor, we need to do a bit more follow-up work on the new learner overview/report page, `ReportsLearnerReportPage`.
Please review https://…
-
Hi, thank you very much for sharing the code. It is very helpful.
I have a question about the meaning of the constant "3". In many places in the code, "3" is used directly to define parameters. s…
-
I often hit CUDA out-of-memory errors during model evaluation (which is called periodically, after 1500 iterations of training).
In motion_lib_real.py line 199 we load the motions in memory and …
-
### Dependency
- #60
### Overview
We need to create epic issues for each project and CoP so that we can manage their marketing issues
#### Details
Management includes
- Identifying missing is…
-
Hi!
Let's bring the reinforcement learning course to the entire Russian-speaking community 🌏
Would you want to translate? Please follow the 🤗 [TRANSLATING guide](https://github.com/huggingface/tran…
-
import logging
import os
import json
import torch
from datasets import load_from_disk
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel…
-
Hi all, I am trying to fine-tune models in extremely long contexts.
I've tested the training setup below and managed to fine-tune:
- llama3.1-1B with a max_sequence_length of 128 * 1024 tokens
…
-
Cool (learning?) project, I guess!
## Underlying problem
One small thing I noticed: as far as I can see, passwords are not being hashed, are they?
https://github.com/search?q=repo%3Ashanirub%…
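
For illustration, here is a minimal sketch of what salted password hashing could look like, using only Python's standard library. It is not taken from the project: the helper names `hash_password` and `verify_password`, the storage format, and the iteration count are all assumptions, and the project's own language and storage layer may call for a different approach.

```python
import hashlib
import hmac
import os


# Hypothetical helpers for illustration only; names, storage format and
# iteration count are assumptions, not part of the project.
def hash_password(password: str, *, iterations: int = 600_000) -> str:
    """Return a salted PBKDF2-HMAC-SHA256 hash encoded as a single string."""
    salt = os.urandom(16)  # fresh random salt for every password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    # Store the algorithm, cost, salt and digest together so the hash
    # can be verified (and the cost raised) later.
    return f"pbkdf2_sha256${iterations}${salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    algorithm, iterations, salt_hex, digest_hex = stored.split("$")
    if algorithm != "pbkdf2_sha256":
        return False
    candidate = hashlib.pbkdf2_hmac(
        "sha256",
        password.encode("utf-8"),
        bytes.fromhex(salt_hex),
        int(iterations),
    )
    return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))
```

Only the string returned by `hash_password` would be stored; at login, `verify_password` checks the submitted password against it, so the plaintext never needs to be kept.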
rugk updated
2 months ago
-
@enricoande
[1] https://github.com/enricoande/reinforcement_learning_examples/blob/95627db2a323535153e711a23f5519ecf7409f38/invertedpendulum/Sarsa/episodeFA.m#L35
It appears that here `phi` cor…