ShibiHe / Model-Free-Episodic-Control

This is an implementation of the paper Model-Free Episodic Control
MIT License

Speeding up Episodic control #2

Open sudeepraja opened 8 years ago

sudeepraja commented 8 years ago

Hey! Really great work with the episodic control agent. I set it running on my Nvidia Tesla and left it overnight. The next morning only 7 epochs had finished. Is there any way we could speed it up?

I ran a profiler for 2 epochs; these are the results, ordered by internal time:

```
ncalls tottime percall cumtime percall filename:lineno(function)
577059602 1077.328 0.000 1077.328 0.000 {method 'reduce' of 'numpy.ufunc' objects}
149385847 1067.267 0.000 2702.390 0.000 numeric.py:2340(within_tol)
149385847 633.005 0.000 4695.321 0.000 numeric.py:2281(isclose)
298771703 422.120 0.000 921.010 0.000 numeric.py:2478(seterr)
448157543 345.168 0.000 1792.992 0.000 fromnumeric.py:1968(all)
298771703 303.658 0.000 333.810 0.000 numeric.py:2578(geterr)
133794 236.010 0.002 5984.444 0.045 EC_functions.py:48(estimate)
128898286 229.696 0.000 654.351 0.000 EC_functions.py:76(_calc_distance)
448161306 222.306 0.000 272.530 0.000 numeric.py:476(asanyarray)
896398980 218.985 0.000 219.419 0.000 {numpy.core.multiarray.array}
448157556 199.122 0.000 1175.297 0.000 {method 'all' of 'numpy.ndarray' objects}
149385847 194.654 0.000 5588.565 0.000 numeric.py:2216(allclose)
149385847 171.426 0.000 235.043 0.000 numeric.py:1970(isscalar)
298772806 158.011 0.000 158.011 0.000 {abs}
448157556 136.889 0.000 976.175 0.000 _methods.py:40(_all)
298771704 116.438 0.000 116.438 0.000 {numpy.core.umath.seterrobj}
128898286 110.123 0.000 424.655 0.000 fromnumeric.py:1737(sum)
149385851 106.513 0.000 522.222 0.000 numeric.py:2874(exit)
278396649/278396642 105.359 0.000 105.361 0.000 {isinstance}
149385851 99.092 0.000 604.393 0.000 numeric.py:2869(enter)
149385853 97.069 0.000 97.069 0.000 {numpy.core.multiarray.result_type}
149385851 94.841 0.000 115.454 0.000 numeric.py:2865(init)
597543408 78.794 0.000 78.794 0.000 {numpy.core.umath.geterrobj}
128898286 65.159 0.000 65.159 0.000 EC_functions.py:16(init)
118009 51.381 0.000 65.375 0.001 {_heapq.nsmallest}
128898287 34.789 0.000 272.828 0.000 _methods.py:31(_sum)
149387544 20.613 0.000 20.613 0.000 {method 'pop' of 'dict' objects}
80842 15.486 0.000 15.486 0.000 ale_python_interface.py:140(act)
128898286 13.995 0.000 13.995 0.000 EC_functions.py:65()
20000 10.066 0.001 656.611 0.033 EC_functions.py:81(update)
19888 9.805 0.000 5994.359 0.301 EC_agent.py:95(_choose_action)
129045706 9.425 0.000 9.425 0.000 {method 'append' of 'list' objects}
```

How about using scikit-learn's KNN instead?
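For context, the `allclose`/`isclose` rows above suggest that each stored state is compared with its own small NumPy call inside a Python loop; a single broadcasted computation over the whole buffer avoids that per-call overhead. A minimal sketch of the difference, with illustrative array names (not the repo's actual code):

```python
import numpy as np

def knn_indices_loop(query, buffer, k):
    """Per-point loop: one small NumPy call per stored state (slow)."""
    dists = [np.sum((query - s) ** 2) for s in buffer]
    return np.argsort(dists)[:k]

def knn_indices_vectorized(query, buffer, k):
    """One broadcasted computation over the whole buffer (fast)."""
    dists = np.sum((buffer - query) ** 2, axis=1)
    return np.argpartition(dists, k)[:k]   # k smallest, unordered

rng = np.random.default_rng(0)
buffer = rng.standard_normal((1000, 64))   # N stored state projections
query = rng.standard_normal(64)

# Both find the same 11 nearest neighbours (no ties with random data).
a = set(knn_indices_loop(query, buffer, 11).tolist())
b = set(knn_indices_vectorized(query, buffer, 11).tolist())
assert a == b
```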

ShibiHe commented 8 years ago

You are asking really incisive questions. Thank you very much, Sudeep.

The first question is about the KNN algorithm. Due to the high dimensionality of the states and the dynamic LRU Q_EC buffer, a KD tree or ball tree (sklearn) does not seem practical. Therefore, I chose the naive implementation using a heap. (Time complexity: O(D·N·log K), where D is the dimension of the states, N is the buffer size for a single action, and K is the number of neighbors.)
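The heap-based lookup described here (and visible as `_heapq.nsmallest` in the profile) can be sketched as follows; function and variable names are illustrative, not the actual `EC_functions.py` code:

```python
import heapq
import numpy as np

def knn_estimate(query, states, q_values, k):
    """Brute-force KNN estimate: O(D*N) distance work plus an
    O(N log K) heap pass; returns the mean Q of the K nearest states."""
    # Pair each stored state's squared distance to the query with its Q value.
    pairs = ((np.sum((query - s) ** 2), q) for s, q in zip(states, q_values))
    nearest = heapq.nsmallest(k, pairs, key=lambda p: p[0])
    return sum(q for _, q in nearest) / len(nearest)

rng = np.random.default_rng(1)
states = list(rng.standard_normal((500, 32)))
q_values = list(rng.standard_normal(500))

# Querying a stored state with k=1 finds that state itself.
assert knn_estimate(states[0], states, q_values, k=1) == q_values[0]
```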

The second question is why episodic control is much slower than the original DQN. This is because, at every step, the return value for each action is estimated via KNN, so the time complexity is approximately O(num_actions·N), and according to the paper, N = 1,000,000.

I have tried to contact the authors Charles Blundell and Benigno Uria about the above issues, but I have not yet received a reply.

sudeepraja commented 8 years ago

Hey! I modified the code in EC_functions.py. You can check it out here

There are two major changes:

  1. I used NumPy arrays of size equal to the maximum size of the LRU cache. This makes it easier to search for the minimum and avoids dynamic list appends, which are slower than pre-allocating a large NumPy array. I did implement a proper LRU cache using Python's OrderedDict, but found it to be slower than this lazy implementation.
  2. I used KD trees from scikit-learn for finding the k nearest neighbors and for testing membership.

Now the first epoch runs in under a minute on my Core i5 laptop. There is still scope for a few changes:

  1. Currently I create a new tree every time the LRU cache is updated. This is the primary bottleneck according to my cProfile run. A dynamic tree data structure that supports insertion, deletion, quick membership testing, and quick k-nearest-neighbor queries would do the job here, but I couldn't find one that fits the task.
  2. A proper LRU cache used along with the dynamic data structure mentioned above.
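The two changes above can be sketched roughly like this, with a ring buffer standing in for the lazy LRU and a full KD-tree rebuild on every update (hypothetical class, not the actual modified `EC_functions.py`):

```python
import numpy as np
from sklearn.neighbors import KDTree

class QECBuffer:
    """Fixed-capacity buffer with KD-tree lookups.  A ring buffer stands
    in for the LRU policy; the tree is rebuilt after every update, which
    is the bottleneck described above."""
    def __init__(self, capacity, dim):
        self.states = np.zeros((capacity, dim))
        self.q_values = np.zeros(capacity)
        self.size = 0
        self.next = 0
        self.tree = None

    def add(self, state, q):
        self.states[self.next] = state
        self.q_values[self.next] = q
        self.next = (self.next + 1) % len(self.states)
        self.size = min(self.size + 1, len(self.states))
        self.tree = KDTree(self.states[:self.size])   # full rebuild each time

    def estimate(self, state, k):
        k = min(k, self.size)
        dist, idx = self.tree.query(state.reshape(1, -1), k=k)
        if dist[0, 0] == 0.0:                # exact hit doubles as membership test
            return float(self.q_values[idx[0, 0]])
        return float(self.q_values[idx[0]].mean())
```

A zero-distance hit on the tree query serves as the membership test, so no separate lookup structure is needed in this sketch.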

ShibiHe commented 8 years ago

Wow, thank you very much. I am on vacation now and will be back at school in two days. I was about to tell you that I got a reply from the authors: they used an approximate KNN algorithm. I will check your code soon, and again thank you very much!

ShibiHe commented 8 years ago

Hello. I checked your code, and you were using a KD tree. I have pushed a new commit in which I implemented approximate KNN. Please have a look if possible. Thanks again.
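One common family of approximate KNN is random-projection hashing, where candidates are drawn only from the query's hash bucket, trading exactness for speed. A minimal sketch of the idea; the actual commit may use a different method, and all names here are hypothetical:

```python
import numpy as np

class RandomProjectionANN:
    """Approximate nearest neighbours via random-projection hashing:
    states are bucketed by the signs of a few random projections, and
    lookups only scan the query's own bucket."""
    def __init__(self, dim, n_bits=8, seed=0):
        rng = np.random.default_rng(seed)
        self.planes = rng.standard_normal((n_bits, dim))
        self.buckets = {}                    # hash -> list of (state, q_value)

    def _hash(self, x):
        return tuple((self.planes @ x > 0).tolist())

    def add(self, state, q):
        self.buckets.setdefault(self._hash(state), []).append((state, q))

    def estimate(self, query, k):
        cand = self.buckets.get(self._hash(query), [])
        if not cand:
            return 0.0                       # empty bucket: fall back to a default
        d = sorted(((np.sum((query - s) ** 2), q) for s, q in cand),
                   key=lambda p: p[0])
        return float(np.mean([q for _, q in d[:k]]))
```

Lookups scan only one bucket instead of the whole buffer, so neighbours that hash into a different bucket can be missed; that is the approximation.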
