-
Good thing I kept all my research work private; my deep Q-network code has already been stolen.
Feel free to contact me if you need help with the CloudSim scheduling and energy part; I have worked on reinforcement learnin…
-
Win10, Node 22.1.0, Go 1.23.3.
When I run `go run main.go`, the following error is raised:
..\..\sql\setup.go:17:2: no required module provides package github.com/delaneyj/realworld-datastar/sql/zz; to ad…
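Since the missing package lives under the module's own `sql/zz` path, it is most likely generated code that has not been produced yet. A minimal sketch of the usual first steps, assuming the repository drives its code generation through `go:generate` directives (check the project's README for the actual codegen command):

```
# Pull in any missing dependencies declared in go.mod.
go mod tidy

# Regenerate packages produced by go:generate directives, if any.
go generate ./...
```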
-
CUDA_VISIBLE_DEVICES=6,7 torchrun --nproc_per_node 2 \
-m FlagEmbedding.llm_reranker.finetune_for_layerwise.run \
--output_dir ./results/reranker/bge-reranker-v2-minicpm-layerwise \
--model_name_or…
-
Hello,
I would like to know what you think about having some standalone implementations as functions that take in the environment and other parameters and return the trained policy.
Here is an examp…
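The message is cut off, but a minimal sketch of the proposed shape (hypothetical names, assuming a discrete Gymnasium-style environment) could look like this: the function takes the environment plus hyperparameters and hands back the trained policy.

```python
import numpy as np

def train_q_learning(env, episodes=500, alpha=0.1, gamma=0.99, epsilon=0.1):
    """Standalone trainer: environment and hyperparameters in, policy out."""
    q = np.zeros((env.observation_space.n, env.action_space.n))
    for _ in range(episodes):
        state, _ = env.reset()
        done = False
        while not done:
            # Epsilon-greedy action selection.
            if np.random.rand() < epsilon:
                action = env.action_space.sample()
            else:
                action = int(np.argmax(q[state]))
            next_state, reward, terminated, truncated, _ = env.step(action)
            done = terminated or truncated
            # Standard Q-learning update.
            q[state, action] += alpha * (
                reward + gamma * np.max(q[next_state]) - q[state, action]
            )
            state = next_state
    # The returned policy is the greedy policy over the learned Q-table.
    return lambda s: int(np.argmax(q[s]))
```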
-
# Deep Q-Network (DQN) on LunarLander-v2 | Chan`s Jupyter
In this post, we will take a hands-on lab of a simple Deep Q-Network (DQN) on the OpenAI LunarLander-v2 environment. This is the coding exercise fr…
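As a rough illustration of the setup such a lab starts from (not the post's actual code; this assumes the classic Gym API, where `reset()` returns just the observation), the Q-network is a small MLP mapping the 8-dimensional LunarLander state to one Q-value per action:

```python
import gym
import torch
import torch.nn as nn

env = gym.make("LunarLander-v2")

# Small MLP: 8-dimensional state in, one Q-value per discrete action out.
q_net = nn.Sequential(
    nn.Linear(env.observation_space.shape[0], 64),
    nn.ReLU(),
    nn.Linear(64, 64),
    nn.ReLU(),
    nn.Linear(64, env.action_space.n),
)

state = env.reset()  # classic Gym (<0.26) API
with torch.no_grad():
    q_values = q_net(torch.as_tensor(state, dtype=torch.float32))
action = int(q_values.argmax())  # greedy action from the (untrained) net
```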
-
Excuse me! What does Q-V learning mean? The algorithm in `Q_V_Garbage.m` looks more like a combination of TD(0) for evaluating v_pi with Sarsa-style control than like Q-learning. Can you …
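For comparison, the textbook forms of the three update rules being contrasted (written here for reference, not quoted from the repository):

```latex
% TD(0) evaluation of v_\pi: updates a state-value estimate.
V(S_t) \leftarrow V(S_t) + \alpha \bigl[ R_{t+1} + \gamma V(S_{t+1}) - V(S_t) \bigr]

% Sarsa (on-policy control): bootstraps on the action actually taken next.
Q(S_t, A_t) \leftarrow Q(S_t, A_t) + \alpha \bigl[ R_{t+1} + \gamma Q(S_{t+1}, A_{t+1}) - Q(S_t, A_t) \bigr]

% Q-learning (off-policy control): bootstraps on the greedy next action.
Q(S_t, A_t) \leftarrow Q(S_t, A_t) + \alpha \bigl[ R_{t+1} + \gamma \max_a Q(S_{t+1}, a) - Q(S_t, A_t) \bigr]
```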
-
Hello, part issue, part question.
When configuring my LSP, I noticed that the configurations below show their **default** values, but do **not** show other valid values (variants).
I believe t…
-
# Title of the Talk: No Code SLM Finetuning with MonsterAPI
## Abstract of the Talk:
Dive into the world of no-code large language model (LLM) finetuning in this informative talk presented by Mons…
-
I see that you are using a zero vector for the rewards and only updating the entry that corresponds to the chosen action here:
https://github.com/AxiomaticUncertainty/Deep-Q-Learning-for-Tic-Tac-Toe/blob/c5c0…
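For context, the standard alternative to a zero target vector is to start the target from the network's own predictions and overwrite only the entry for the action actually taken, so the untaken actions contribute no gradient. A minimal sketch with illustrative names (not the repository's code):

```python
import numpy as np

def build_target(q_pred, action, reward, q_next, gamma=0.99, done=False):
    """Copy current predictions, then overwrite only the taken action."""
    target = q_pred.copy()
    bootstrap = 0.0 if done else gamma * np.max(q_next)
    target[action] = reward + bootstrap
    return target
```

With a zero vector as the starting point, every untaken action's Q-value would also be pushed toward zero on each update, which may be the concern behind this comment.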
-
Hi, first of all, thanks for the great tool. I am still learning how to use it, so apologies if any of this is trivial.
Inspecting the MULTIFACED_CARDS.txt file shows that it (1) is outdated and (2) con…