yandexdataschool / Practical_RL

A course in reinforcement learning in the wild
The Unlicense

Port changes from master to coursera #262

Closed by dniku 4 years ago

dniku commented 5 years ago

I have just discovered that we never opened an issue for this porting work, which we've been doing for a while now, so I'm creating one to document progress.

Also, here is a convenient CSV that maps files in master to their counterparts in coursera:

week01_intro/crossentropy_method.ipynb,week1_intro/crossentropy_method.ipynb
week01_intro/deep_crossentropy_method.ipynb,week1_intro/deep_crossentropy_method.ipynb
week01_intro/pong.py,
week01_intro/primer_python_for_ml/recap_ml.ipynb,week1_intro/primer/recap_ml.ipynb
week01_intro/primer_python_for_ml/train.csv,week1_intro/primer/train.csv
week01_intro/project_starter_evolution_strategies.ipynb,
week01_intro/seminar_gym_interface.ipynb,week1_intro/gym_interface.ipynb
week02_value_based/mdp.py,week2_model_based/mdp.py
week02_value_based/seminar_vi.ipynb,week2_model_based/practice_vi.ipynb
week03_model_free/homework.ipynb,week3_model_free/sarsa.ipynb;week3_model_free/experience_replay.ipynb
week03_model_free/seminar_qlearning.ipynb,week3_model_free/qlearning.ipynb
week04_[recap]_deep_learning/fix_my_nn.ipynb,
week04_[recap]_deep_learning/mnist.py,week1_intro/primer/mnist.py
week04_[recap]_deep_learning/notmnist.py,
week04_[recap]_deep_learning/practice_lasagne.ipynb,
week04_[recap]_deep_learning/seminar_tensorflow.ipynb,week1_intro/primer/recap_tensorflow.ipynb
week04_[recap]_deep_learning/seminar_pytorch.ipynb,week1_intro/primer/recap_pytorch.ipynb
week04_[recap]_deep_learning/seminar_tensorflow.ipynb,
week04_approx_rl/atari_wrappers.py,week4_approx/atari_wrappers.py
week04_approx_rl/framebuffer.py,week4_approx/framebuffer.py
week04_approx_rl/homework_lasagne.ipynb,
week04_approx_rl/homework_pytorch_debug.ipynb,
week04_approx_rl/homework_pytorch_main.ipynb,week4_approx/dqn_atari_pytorch.ipynb
week04_approx_rl/homework_tf.ipynb,week4_approx/dqn_atari.ipynb
week04_approx_rl/replay_buffer.py,week4_approx/replay_buffer.py
week04_approx_rl/seminar_lasagne.ipynb,
week04_approx_rl/seminar_pytorch.ipynb,week4_approx/practice_approx_qlearning_pytorch.ipynb
week04_approx_rl/seminar_tf.ipynb,week4_approx/practice_approx_qlearning.ipynb
week04_approx_rl/utils.py,week4_approx/utils.py
week05_explore/bayes.py,
week05_explore/week5.ipynb,week6_outro/bandits.ipynb
week06_policy_based/a2c-optional.ipynb,
week06_policy_based/atari_wrappers.py,
week06_policy_based/env_batch.py,
week06_policy_based/reinforce_lasagne.ipynb,
week06_policy_based/reinforce_pytorch.ipynb,
week06_policy_based/reinforce_tensorflow.ipynb,week5_policy_based/practice_reinforce.ipynb
week06_policy_based/runners.py,
week07_[recap]_rnn/seminar_lasagne.ipynb,
week07_[recap]_rnn/seminar_lasagne_ingraph.ipynb,
week07_[recap]_rnn/seminar_pytorch.ipynb,
week07_[recap]_rnn/seminar_tf.ipynb,
week07_seq2seq/basic_model_tf.py,week6_outro/seq2seq/basic_model_tf.py
week07_seq2seq/basic_model_theano.py,
week07_seq2seq/basic_model_torch.py,
week07_seq2seq/bonus.ipynb,
week07_seq2seq/practice_tf.ipynb,week6_outro/seq2seq/practice_tf.ipynb
week07_seq2seq/practice_theano.ipynb,
week07_seq2seq/practice_torch.ipynb,
week07_seq2seq/voc.py,week6_outro/seq2seq/voc.py
week08_pomdp/atari_util.py,week5_policy_based/atari_util.py
week08_pomdp/env_pool.py,
week08_pomdp/homework_common_part2.ipynb,
week08_pomdp/practice_pytorch.ipynb,
week08_pomdp/practice_tensorflow.ipynb,week5_policy_based/practice_a3c.ipynb
week08_pomdp/practice_theano.ipynb,
week08_pomdp/theano_optional_recurrence_tutorial.ipynb,
week09_policy_II/mujoco_wrappers.py,
week09_policy_II/ppo.ipynb,
week09_policy_II/runners.py,
week09_policy_II/seminar_TRPO_pytorch.ipynb,
week09_policy_II/seminar_TRPO_tensorflow.ipynb,
week09_policy_II/seminar_TRPO_theano.ipynb,
week10_planning/seminar_MCTS.ipynb,week6_outro/practice_mcts.ipynb
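
To see at a glance which master files still lack a coursera counterpart, the CSV can be parsed with a few lines of Python. This is a minimal sketch, not part of the repository: the name `parse_mapping` is hypothetical, and it assumes (judging from the rows above) that an empty second column means "no counterpart yet" and that `;` separates multiple coursera files.

```python
import csv
import io

# A few rows copied from the mapping above. In practice the whole CSV
# would be read from a file of your choosing.
SAMPLE = """\
week01_intro/pong.py,
week01_intro/seminar_gym_interface.ipynb,week1_intro/gym_interface.ipynb
week03_model_free/homework.ipynb,week3_model_free/sarsa.ipynb;week3_model_free/experience_replay.ipynb
"""


def parse_mapping(text):
    """Return a list of (master_path, [coursera_paths]) pairs.

    A list of pairs is used instead of a dict so that a master path
    appearing on more than one row is kept every time it occurs.
    """
    pairs = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        master = row[0]
        targets = row[1] if len(row) > 1 else ""
        # Assumption: ';' separates several coursera files for one master file.
        pairs.append((master, [t for t in targets.split(";") if t]))
    return pairs


pairs = parse_mapping(SAMPLE)
# Assumption: an empty target list means the file has no coursera counterpart.
unmatched = [master for master, targets in pairs if not targets]
print("\n".join(unmatched))
```

Running the same function over the full CSV would list every master file whose second column is empty.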

I recommend aliasing column -s, -t as csv in your shell to display CSVs.
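
In bash or zsh the alias would presumably be written as alias csv='column -s, -t'. Where column is not available, a rough Python stand-in can align the fields instead. The sketch below uses a hypothetical helper name, `format_table`, and only approximates what column does: it pads each field to its column's width and does not try to reproduce column's exact spacing or options.

```python
def format_table(text, sep=","):
    """Align separator-delimited text into padded columns."""
    rows = [line.split(sep) for line in text.splitlines() if line]
    if not rows:
        return ""
    # Pad short rows so every row has the same number of fields.
    ncols = max(len(row) for row in rows)
    rows = [row + [""] * (ncols - len(row)) for row in rows]
    # Each column is as wide as its longest field.
    widths = [max(len(row[i]) for row in rows) for i in range(ncols)]
    return "\n".join(
        "  ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in rows
    )


print(format_table("week01_intro/pong.py,\nweek02_value_based/mdp.py,week2_model_based/mdp.py\n"))
```

Rows with an empty second field simply end after the first column, which makes unported files easy to spot.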

dniku commented 4 years ago

Closed in #353.