PacktPublishing / Deep-Reinforcement-Learning-Hands-On-Second-Edition

Deep-Reinforcement-Learning-Hands-On-Second-Edition, published by Packt

Chapter 9 ignite.engine dependency issue #6

Closed HyoungsungKim closed 4 years ago

HyoungsungKim commented 4 years ago

I installed pytorch-ignite using pip, following the instructions on the pytorch-ignite GitHub page.

However, I got this error:

$ python3 baseline.py --cuda
Traceback (most recent call last):
  File "baseline.py", line 98, in <module>
    engine.run(batch_generator(buffer, params.replay_initial, params.batch_size))
  File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 590, in run
    raise ValueError("Argument `epoch_length` should be defined if `data` is an iterator")
ValueError: Argument `epoch_length` should be defined if `data` is an iterator

After I comment out lines 589 and 590 of ignite's engine.py, baseline.py works.

Which version of ignite do I have to use?

I tried pytorch-ignite 0.3.0 and 0.4.0, but neither worked without commenting out a few lines of engine.py.

In 0.4.0 I commented out lines 589 and 590, and in 0.3.0 lines 834 and 835.

vfdev-5 commented 4 years ago

@HyoungsungKim according to the requirements (https://github.com/PacktPublishing/Deep-Reinforcement-Learning-Hands-On-Second-Edition/blob/master/requirements.txt), Ignite should be 0.2.1.
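
For example, pinning it with pip matches the requirements file:

$ pip install pytorch-ignite==0.2.1

If you prefer to stay on a newer Ignite, you can also pass epoch_length explicitly instead of editing engine.py, since 0.3+ requires it when data is a plain iterator. A minimal sketch of the adapted call in baseline.py (the value 100 is only illustrative):

engine.run(
    batch_generator(buffer, params.replay_initial, params.batch_size),
    epoch_length=100,  # illustrative: how many iterations make one "epoch"
)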

Shmuma commented 4 years ago

Yep, they are baking new versions at the speed of light :)

vfdev-5 commented 4 years ago

Well, our speed of light is about one release per 4-6 months. 0.4.0 is the nightly version and 0.3.0 is the stable one, with some backward-compatibility (BC) changes.

Shmuma commented 4 years ago

@vfdev-5 still too fast in comparison to writing a book, implementing all the examples, and making them work.

For the first edition (which took 6 months), I updated the PyTorch version twice to keep it more or less the latest at the time the book was published. With the second edition (work on which started last June), the PyTorch and Ignite versions were updated only once, but still, rechecking all the examples on a GPU takes a lot of time.

HyoungsungKim commented 4 years ago

Thank you for answering, @vfdev-5. I should have read the requirements more carefully.

@Shmuma If you don't mind, I want to ask something about studying this book.

I have studied reinforcement learning through Coursera and Sutton's book.

Currently I am studying this book to build my own agent, and I have progressed to Chapter 9.

This book is very good and I really thank you for the awesome content.

But since starting to use the ptan library, I have begun to worry about whether I will be able to implement my own agent after finishing this book.

In the book, you mentioned that its purpose is developing intuition for RL, not learning how to use a library.

However, I feel I am just learning how to use a library; maybe I have lost the way to developing intuition.

If you don't mind, could you advise me on how to study this book?

Before using ptan, I was really satisfied, because I typed out all of the code in the repositories and it was very helpful for understanding how an agent works.

But as I mentioned, after starting to use ptan I began to worry about whether I can implement my own agent.

I think ptan is good for understanding what an agent does, but I cannot see how the mathematical expressions and RL tweaks are implemented in Python.

I tried to type out all the code of ptan and the other libraries that you implemented in the repositories.

But I don't know whether that is the right way or not; sometimes I feel like I am wasting time studying unimportant things.

That is because I don't know what is important and what is not (or maybe I am worrying too much).

Do you recommend typing out all of the source code, or do you recommend another way?

Thank you again for the awesome content, and have a nice week :)

Shmuma commented 4 years ago

@HyoungsungKim The main point of ptan was to avoid reimplementing the same code again and again (this is explained in Chapter 7). I haven't counted, but I think there are at least 50 training loops across all the book's examples. That means, if they were implemented from scratch (without third-party libraries), there would be 50 replay buffers, 50 pieces of code to track states, and so on and so forth. At the same time, I'm not very satisfied with the existing high-level RL libraries, which, from my perspective, do not fully meet the educational purpose of the book. So, ptan is kept really minimalistic, something you could reimplement yourself in 3-4 days. That is one of the reasons why ptan is not properly maintained as a general RL library: no docs, no bells and whistles, only the essential functionality. I never intended to make a general-purpose RL library that everybody should enjoy using. At the same time, it saves lots of time and code once you know how it works, but you're not forced to use it -- it just saves you from writing a replay buffer yet another time.
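
To give a sense of how small that core functionality is, here is a minimal sketch of a uniform replay buffer (an illustration, not ptan's actual implementation):

import random
from collections import deque

class ReplayBuffer:
    """Minimal FIFO experience replay -- a sketch, not ptan's code."""
    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def __len__(self):
        return len(self.buffer)

    def append(self, experience):
        # experience is whatever tuple the agent produces,
        # e.g. (state, action, reward, done, next_state)
        self.buffer.append(experience)

    def sample(self, batch_size):
        # uniform sampling without replacement, enough for basic DQN
        return random.sample(self.buffer, batch_size)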

Returning to the question of how to study the book: I think retyping all the code might be too extreme and not very efficient. I'd suggest you try to implement some examples from the book on your own, maybe designing your own version of ptan, or not using other libraries at all. This will inevitably lead to errors in your code, some of them hard to find, but during the debugging you'll learn much more about the method and the code logic than you would from retyping. In addition, you can use the book's code as a reference point to compare the speed, the convergence, and the policy produced; a toy skeleton of such a from-scratch loop is sketched below.
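
For concreteness, a self-contained toy version of such a loop (every name and number here is illustrative; a real agent would plug in an actual environment, a network, and a loss in place of the stubs):

import random
from collections import deque

def env_step(state, action):
    # Stub environment: walks toward state 10 and ends the episode there.
    next_state = state + 1
    reward = random.random()
    done = next_state >= 10
    return next_state, reward, done

buffer = deque(maxlen=1000)
state = 0
for step in range(200):
    action = random.randint(0, 1)  # placeholder for e.g. an epsilon-greedy policy
    next_state, reward, done = env_step(state, action)
    buffer.append((state, action, reward, done, next_state))
    state = 0 if done else next_state
    if len(buffer) >= 32:
        batch = random.sample(buffer, 32)
        # here a real agent would compute the loss on the batch and optimize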

This method might not be universal and is quite time-consuming, but in my experience it is the best way to get a real understanding of what's going on :). In fact, I personally learned a lot while preparing the book's examples. It wasn't always a nice and smooth experience, but it was really helpful.

Good luck with your studying!

HyoungsungKim commented 4 years ago

@Shmuma Thanks a lot for your help! I hope someday I can mention this book in a good paper :)

ugurkanates commented 4 years ago

Thanks for the bug fix