q-learning Search Results

1000+ results for q-learning

aimacode/aima-java #203

AIMA4e reimplement Fig 21.8 Q-Learning-Agent from AIMA3e bra…

ctjoreilly updated 6 years ago
8
HaysonC/NEAT-PongBot #1

Fine tuning DQN

I have no idea what is wrong; check the reward and Q-function graphs. Sometimes you stumble upon a functional agent that moves well or seems to chase the ball, but it is highly unstable. https:/…

supreme-gg-gg updated 2 weeks ago
4
unslothai/unsloth #1101

Getting CUDA OOM on training gemma-2-2b with "lm_head" and "…

Hi @danielhanchen, I am trying to fine-tune gemma2-2b for my task following the guidelines of the continued finetuning in unsloth. However, I am facing OOM while doing so. My intent is to train gemm…

InderjeetVishnoi updated 1 month ago
6
alireza-montazeri/AV-Nash-Q-Learning #1

Two small problems about the procedure

Thank you very much for your outstanding work. I have a few small questions that I want to confirm with you. Firstly, in the `my_highway_env.py` file, `vehicle = self.action_type.vehicle_class`…

ShenZC25 updated 9 months ago
1
Niketkumardheeryan/ML-CaPsule #1142

Waste Management through Reinforcement Learning

The project aims to develop a reinforcement learning (RL) agent to optimize waste collection in a simulated environment, minimizing overflow events and improving efficiency. Environment and State R…

Panchadip-128 updated 1 month ago
2
dxyang/DQN_pytorch #7

How can I deal with it?

Traceback (most recent call last):
  File "main.py", line 136, in <module>
    main()
  File "main.py", line 132, in main
    atari_learn(env, task.env_id, num_timesteps=task.max_timesteps, double_dqn=dou…

liyuxiang1111 updated 2 weeks ago
1
OpenRLHF/OpenRLHF #501

assert state_dict_keys.issubset( [rank0]: AssertionError: mi…

When fine-tuning Llama-3___2-3B-Instruct with QLoRA on two 32 GB V100 GPUs, an error is raised at the end, when the model is saved. The slurm script is: ```#!/bin/bash #SBATCH --job-name=openrlhf #SBATCH --partition=gpu_v100 #SBATCH --nodes=1 #SBATCH --ntasks-per-node=1 #SBATCH…

anoxia-1 updated 3 weeks ago
1
qiskit-community/qiskit-camp-africa-19 #43

Quantum speed up of a Q-means algorithm for unsupervised mac…

# Abstract This project implements the NeurIPS 2019 paper: q-means: A quantum algorithm for unsupervised machine learning https://papers.nips.cc/paper/8667-q-means-a-quantum-algorithm-for-unsupervi…

waheeda-saib updated 4 years ago
13
eosphoros-ai/DB-GPT-Hub #293

Fine-tuning Qwen2___5-Coder-7B-Instruct with the parameters below produces the following error. Please help!

CUDA_VISIBLE_DEVICES=0 python /home/ubuntu/TextToSQL/DB-GPT-Hub/src/dbgpt-hub-sql/dbgpt_hub_sql/train/sft_train.py\ --model_name_or_path /home/ubuntu/.cache/modelscope/hub/qwen/Qwen2___5-Coder-7B…

ychuest updated 1 month ago
9
hpi-epic/BP2021 #447

[Configuration] Change the default RL-Agent to something bet…

By changing the default agent class in the various `default_files`, we can "make" more people use better RL agents than QLearning.

NikkelM updated 2 years ago
1
