q-learning Search Results

1000+ results for q-learning


sap-tutorials/sap-build-apps #623

Minor update to "Create a CAP Service with BAS Productivity …

In step 2.5, it is asked to [enable the "managed" Aspect](https://github.com/sap-tutorials/sap-build-apps/blob/main/tutorials/build-apps-cap-service/4-create-relationship4.jpg). This cannot be don…

goncalvesp updated 2 weeks ago
1
IIC2613-Inteligencia-Artificial-2024-2/Syllabus #90

Assignment 5, part 2, cell generation for Flappy Bidoof

Hi! I have a question about what we should expect as convergence for our Q-learning algorithm since, as I understand it, our agent must seek the highest possible reward, which…

chocolito24 updated 5 days ago
2
kongds/MoRA #20

Dataset format

`RANK=8 deepspeed --num_gpus=8 --num_nodes=2 train.py \ --base_model --micro_batch_size 4\ --wandb_run_name mora_math_r8 --lora_target_modules q_proj,k_proj,v_proj,o_proj,ga…

lcykww updated 3 weeks ago
1
takuseno/d3rlpy #421

Alteration to default summation used to define TD loss of en…

**Is your feature request related to a problem? Please describe.** I have a feeling that the reduction used for computing the TD error over ensembles of Q functions should be a mean rather than a sum…

joshuaspear updated 1 month ago
1
EdDataScienceEES/tutorial-shaistilman #2

Feedback on Tutorial

WELL DONE! This tutorial is amaze, the first thing that stood out to me was the topic- military coups is something I never would have thought of and so interesting! What went well: - Very informat…

AmeliaYoung updated 2 days ago
1
tflearn/tflearn #634

Epsilon truncates learning in 1 Step Q runner

This seems to be a conceptual issue. In the pacman example the e-greedy policy is annealed over time. If the network is run for more than a few hours, epsilon eventually goes to 0 and the distributi…

neale updated 7 years ago
2
vllm-project/vllm #10294

[Feature]: Quark quantization format upstream to VLLM

Quark is a comprehensive cross-platform toolkit designed to simplify and enhance the quantization of deep learning models. Supporting both PyTorch and ONNX models, Quark empowers developers to optimiz…

kewang-xlnx updated 5 days ago
3
C-V2X-Senior-Design/TrackTasks #8

Simple Q-Learning / Classification Model for Signal Detectio…

This will be a precursor to the machine learning model we will use for detecting jammers and jammed signals. For now, it will consist of a simple "on" or "off" sequence where the ML model will learn…

jasoninirio updated 2 years ago
3
StephanAkkerman/FluentAI #12

Research related papers

Look into papers related to what we are doing such as, AI, mnemonics, language learning, flash cards. Especially the topic of efficient language learning can be interesting and what makes vocabulary s…

StephanAkkerman updated 2 weeks ago
3
simoninithomas/Deep_reinforcement_learning_Course #22

Deep Q learning with Doom, last part

I tried testing the code and it crashes after the 500 episodes of training from the last part. For `state = stack_frames(stacked_frames, frame)` did you forget a True or False at the end? And possibl…

meguvegu updated 5 years ago
1
