-
Step 2.5 asks you to [enable the "managed" Aspect](https://github.com/sap-tutorials/sap-build-apps/blob/main/tutorials/build-apps-cap-service/4-create-relationship4.jpg).
This cannot be don…
-
Hi! I have a question about what we should expect as convergence for our Q-learning algorithm since, as I understand it, our agent should seek the highest possible reward, which…
-
`RANK=8
deepspeed --num_gpus=8 --num_nodes=2 train.py \
--base_model --micro_batch_size 4 \
--wandb_run_name mora_math_r8 --lora_target_modules q_proj,k_proj,v_proj,o_proj,ga…
-
**Is your feature request related to a problem? Please describe.**
I have a feeling that the reduction used for computing the TD error over ensembles of Q functions should be a mean rather than a sum…
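The difference between the two reductions can be sketched as follows. This is a minimal NumPy illustration, not the library's actual loss: the function name `ensemble_td_loss`, the array shapes, and the squared-error form are all assumptions made for the example.

```python
import numpy as np

def ensemble_td_loss(q_values, targets, reduction="mean"):
    """Hypothetical helper: squared TD error over an ensemble of Q functions.

    q_values: array of shape (n_ensemble, batch), one row per Q function.
    targets:  array of shape (batch,), shared TD targets.
    """
    td_errors = q_values - targets[None, :]        # (n_ensemble, batch)
    per_member = np.mean(td_errors ** 2, axis=1)   # one loss per ensemble member
    if reduction == "mean":
        return float(per_member.mean())
    if reduction == "sum":
        return float(per_member.sum())
    raise ValueError(f"unknown reduction: {reduction!r}")

q = np.array([[1.0, 2.0], [3.0, 4.0]])   # two ensemble members, batch of two
y = np.zeros(2)
loss_mean = ensemble_td_loss(q, y, "mean")
loss_sum = ensemble_td_loss(q, y, "sum")
```

With a sum, the loss (and therefore the gradient scale) grows linearly with the number of ensemble members, so the effective learning rate depends on ensemble size; a mean keeps it independent of that size.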
-
WELL DONE! This tutorial is amazing. The first thing that stood out to me was the topic: military coups are something I never would have thought of, and so interesting!
What went well:
- Very informat…
-
This seems to be a conceptual issue. In the Pac-Man example, the ε-greedy policy is annealed over time.
If the network is run for more than a few hours, epsilon eventually goes to 0 and the distributi…
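One common way to keep epsilon from reaching 0 is to anneal it toward a small floor. The sketch below assumes a linear schedule; the function name `annealed_epsilon` and the parameters `eps_start`, `eps_min`, and `anneal_steps` are hypothetical and are not taken from the example being discussed.

```python
def annealed_epsilon(step, eps_start=1.0, eps_min=0.05, anneal_steps=100_000):
    """Hypothetical linear schedule: decay from eps_start to eps_min, then hold."""
    frac = min(step / anneal_steps, 1.0)              # progress through annealing, capped at 1
    eps = eps_start - frac * (eps_start - eps_min)    # linear interpolation
    return max(eps_min, eps)                          # never drop below the floor

eps_at_start = annealed_epsilon(0)
eps_late = annealed_epsilon(10_000_000)               # long after annealing has finished
eps_no_floor = annealed_epsilon(10_000_000, eps_min=0.0)
```

With `eps_min=0.0` the schedule reproduces the situation described: after enough steps the agent stops exploring entirely. A nonzero floor keeps a small amount of exploration for arbitrarily long runs.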
-
Quark is a comprehensive cross-platform toolkit designed to simplify and enhance the quantization of deep learning models. Supporting both PyTorch and ONNX models, Quark empowers developers to optimiz…
-
This will be a precursor to the machine learning model we will use for detecting jammers and jammed signals.
For now, it will consist of a simple "on" or "off" sequence where the ML model will learn…
-
Look into papers related to what we are doing, such as AI, mnemonics, language learning, and flash cards. The topic of efficient language learning is especially interesting, as is what makes vocabulary s…
-
I tried testing the code, and it crashes after the 500 training episodes in the last part.
In `state = stack_frames(stacked_frames, frame)`, did you forget a `True` or `False` at the end? And possibl…