Hi @BernardinD, currently we don't have any plan to support TensorFlow after 1.7.1, so the best option would be to stick with the 1.7 version. Just curious, what is your use case for TensorFlow 1.9? I understand a newer version is always better, but I'm wondering what you can't do on 1.7.
I did end up running it on 1.7. I started with 1.9 before I really knew anything, so I was already building graphs in a newer version and wasn't looking to retrain everything. Luckily I was given the idea to downgrade TensorFlow and the models repository on a different system and produce the frozen graph from that combination using my TensorFlow 1.9 checkpoints. There weren't any compatibility issues that I could see; a sketch of that re-export step is below.
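In case it helps anyone following the same route, here is a minimal sketch of re-freezing under TF 1.7 with the standard freeze_graph tool; all paths and the output node name are placeholders, not the actual ones from my project:

```python
# Hedged sketch: re-freeze a graph under TF 1.7 from checkpoints written
# by TF 1.9. All paths and the output node name below are placeholders.
from tensorflow.python.tools import freeze_graph

freeze_graph.freeze_graph(
    input_graph="model/graph.pbtxt",      # graph definition, text format
    input_saver="",
    input_binary=False,
    input_checkpoint="model/model.ckpt",  # checkpoints saved under TF 1.9
    output_node_names="action",           # placeholder output node
    restore_op_name="save/restore_all",
    filename_tensor_name="save/Const:0",
    output_graph="model/frozen_graph.pb",
    clear_devices=True,
    initializer_nodes="",
)
```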
@BernardinD How did you convert your graph from TensorFlow 1.9 to 1.7?
@Amanpradhan I believe all that is needed is to roll back the models repository to a point corresponding with 1.7.
Hi @SetoKaiba, even if it is working, it won't work on the iOS or Android platforms, since we don't have the corresponding support for TF 1.8 and above. For this reason I won't spend the time to look at your issue.
Hi @Vincent, what do you mean? Does TF not have versions 1.8+ for those platforms?
@TashaSkyUp We used to use TensorFlowSharp, which doesn't have 1.8+ support for those platforms. We have now switched to Barracuda, which should support 1.8+ on those platforms, but we haven't tested it yet.
@SetoKaiba It should work, but it will need extra work on our side to test it.
If there is anything I can do to help, please let me know. I worry this may currently be a bit beyond me, however; this is a high priority for me.
@xiaomaogy I finally got it working, and I tested the .nn model it generates. I fixed it with a small workaround; at least it works in TF 2.0:
if not isinstance(value_estimate, float):
    value_estimate = value_estimate[0][0]
https://github.com/SetoKaiba/ml-agents/blob/tf2/ml-agents/mlagents/trainers/ppo/policy.py#L199-L200
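As far as I can tell (this is my guess at the cause, not something I verified in the TF source), the session now returns the value estimate as a [1, 1] array instead of a plain float, so the guard just unwraps it. A toy illustration with a made-up value:

```python
import numpy as np

# Made-up stand-in for what sess.run returns for a [1, 1] tensor in TF 2.0.
value_estimate = np.array([[0.42]])

# The workaround: unwrap the 2-D array back into the scalar callers expect.
if not isinstance(value_estimate, float):
    value_estimate = value_estimate[0][0]

print(value_estimate)  # 0.42, a scalar again
```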
And it trains successfully. I tried the generated .nn file as well, and it works. The next step may be converting to the eager execution style of 2.0.
> I0402 21:14:33.795373 3764 trainer.py:169] ppo-0: 3DBallLearning: Step: 1000. Mean Reward: 1.181. Std of Reward: 0.661. Training.
> I0402 21:14:48.256214 3764 trainer.py:169] ppo-0: 3DBallLearning: Step: 2000. Mean Reward: 1.223. Std of Reward: 0.755. Training.
> I0402 21:15:03.462997 3764 trainer.py:169] ppo-0: 3DBallLearning: Step: 3000. Mean Reward: 1.241. Std of Reward: 0.688. Training.
> [steps 4000-19000 trimmed: mean reward climbs steadily to ~85]
> I0402 21:19:23.605824 3764 trainer.py:169] ppo-0: 3DBallLearning: Step: 20000. Mean Reward: 100.000. Std of Reward: 0.000. Training.
> [steps 21000-49000 trimmed: mean reward holds between the high 60s and 100]
> I0402 21:26:50.629204 3764 trainer_controller.py:87] Saved Model
> I0402 21:26:50.648164 3764 trainer.py:169] ppo-0: 3DBallLearning: Step: 50000. Mean Reward: 100.000. Std of Reward: 0.000. Training.
> I0402 21:26:52.473239 3764 trainer_controller.py:87] Saved Model
> I0402 21:26:52.769763 3764 policy.py:193] List of nodes to export for brain :3DBallLearning
> I0402 21:26:52.769763 3764 policy.py:195] is_continuous_control
> I0402 21:26:52.769763 3764 policy.py:195] version_number
> I0402 21:26:52.769763 3764 policy.py:195] memory_size
> I0402 21:26:52.769763 3764 policy.py:195] action_output_shape
> I0402 21:26:52.769763 3764 policy.py:195] action
> I0402 21:26:52.769763 3764 policy.py:195] action_probs
> I0402 21:26:52.770262 3764 policy.py:195] value_estimate
> [TF deprecation warnings for checkpoint_exists, convert_variables_to_constants and extract_sub_graph trimmed]
> I0402 21:26:53.616407 3764 saver.py:1280] Restoring parameters from ./models/ppo/3DBallLearning\model-50001.cptk
> I0402 21:26:53.832060 3764 graph_util_impl.py:270] Froze 20 variables.
> I0402 21:26:53.847037 3764 graph_util_impl.py:303] Converted 20 variables to const ops.
> Converting ./models/ppo/3DBallLearning/frozen_graph_def.pb to ./models/ppo/3DBallLearning.nn
> IGNORED: Cast unknown layer
> IGNORED: StopGradient unknown layer
> GLOBALS: 'is_continuous_control', 'version_number', 'memory_size', 'action_output_shape'
> IN: 'vector_observation': [-1, 1, 1, 8] => 'sub_3'
> IN: 'epsilon': [-1, 1, 1, 2] => 'mul_1'
> OUT: 'action', 'action_probs', 'value_estimate'
> I0402 21:26:53.933398 3764 policy.py:184] Exported ./models/ppo/3DBallLearning.nn file
> DONE: wrote ./models/ppo/3DBallLearning.nn file.
https://github.com/SetoKaiba/ml-agents/tree/tf2 I made a tf2 branch for this. Training is tested and working with 3DBall, and the generated .nn file works with Barracuda as well.
https://github.com/SetoKaiba/ml-agents/tree/tf2-develop The develop branch also splits ml-agents into two packages, so I made a tf2-develop branch as well. Training is likewise tested and working with 3DBall, and the generated .nn file works with Barracuda.
Hi @SetoKaiba, if you don't mind, could you make a PR once you've got it working? There are a few main things I can think of right now that testing should cover.
@xiaomaogy Of course. The updated branch just replaces calls with the v1 compat API, plus the equivalent functional API for the contrib module. It works with TF 2.0a0, but it's incompatible with v1. Which branch should I create the PR against? The sketch below shows the kind of substitution involved.
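For context, a minimal sketch of the substitution pattern, assuming TF 2.0a0; this is an illustrative toy graph, not code from the branch:

```python
import tensorflow as tf

# Keep graph/session-style code working under TF 2.0 by routing every v1
# symbol through tf.compat.v1 and disabling v2 behaviors up front.
tf.compat.v1.disable_v2_behavior()

graph = tf.Graph()
with graph.as_default():
    # tf.placeholder -> tf.compat.v1.placeholder, and so on for each call.
    obs = tf.compat.v1.placeholder(tf.float32, [None, 8], name="vector_observation")
    hidden = tf.compat.v1.layers.dense(obs, 64, activation=tf.nn.relu)
    action = tf.compat.v1.layers.dense(hidden, 2, name="action")
    init = tf.compat.v1.global_variables_initializer()

with tf.compat.v1.Session(graph=graph) as sess:
    sess.run(init)
    print(sess.run(action, feed_dict={obs: [[0.0] * 8]}))
```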
@xiaomaogy And about the develop branch that separates the two packages: you did update the documentation, but you didn't update the Dockerfile. Maybe there is something else as well.
@xiaomaogy I tested it with 3DBall, Hallway, GridWorld and Pyramids, and they all work. But when I tested it with my own game, the .nn file was broken. If you don't mind, could you please have a look at the file? https://www98.zippyshare.com/v/E1g7vLaG/file.html
It's strange. I tried training with 1.7.1 again and got the .nn file; it has the problem as well, even though it worked for my game before. Yesterday I changed my RayPerception to use a native collection; maybe that's the cause. I'll have a look tomorrow. It's too late in China and I have to go to bed.
Hi @SetoKaiba, thanks for testing out these things! Given just the .nn file of your game, I guess I can't do the testing. Also, I didn't actually work on the Barracuda piece myself and am not so familiar with it. If you find a reproducible bug, please report it and I will get the right person to look into it.
We are going to release soon, so we won't be able to merge your TensorFlow upgrade PR for this release.
Also, thanks for reminding us about the Dockerfile; we will update it.
The bug appears as soon as you drag the .nn file onto a Learning Brain, so you don't have to run it. I've also attached the folder containing the frozen graph and related files; Barracuda converts to the .nn file from the frozen graph.
@xiaomaogy Is the latest commit on the develop branch broken? I tried to merge it into my tf2-develop branch and it broke my branch. I tried using the develop branch directly, and it's broken as well. Here's the log from when I try to train.
The error was introduced by this commit: https://github.com/Unity-Technologies/ml-agents/commit/e59eff493707db84ffdbb1221cc70ba83c255ddf
@SetoKaiba Yes, it is broken; we are working on fixing it right now.
@SetoKaiba How did you get this error output? What does the environment use?
@harperj
@xiaomaogy Do you mean this log? Just "mlagents-learn config\trainer_config.yaml --train" on Windows, using the develop branch.
@SetoKaiba Yes, but what does your Unity environment use? Continuous or discrete actions? RNN? Curiosity? Or just 3DBall?
@xiaomaogy No, it's the Editor. I get it even without training: just run the command and the error shows up. Normally it should tell me to hit the Play button in the Editor.
@SetoKaiba Ah I see, now we know how to reproduce it. We will look into it. Thanks for raising this!
Thank you. Is the Barracuda issue being looked into as well? I found a way to reproduce it and filed it in #1897.
@SetoKaiba Yes, I'll work with @vincentpierre to look into it.
@SetoKaiba I'm looking into the pickling bug you ran into as well. I can reproduce it on a Windows machine; it looks like a regression from our recent update.
@SetoKaiba I have a PR which should address the issue with the subprocess environment here: #1912
We'll make sure this or a similar fix for the pickling issue makes it in before the v0.8 release.
@harperj In the previous implementation we could stop the game from the Unity side, or terminate training with Ctrl+C, and the model would still be saved. Now it isn't. I don't know whether it saves the model after the whole training run; let me test it and report back later. The sketch below describes the old behavior I mean.
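Roughly, the old flow behaved like this (run_training and save_model are placeholder names, not the real functions):

```python
def run_training():
    # Stand-in for the real training loop; runs until interrupted.
    while True:
        pass

def save_model():
    # Stand-in for writing the checkpoint and exporting the .nn file.
    print("Model saved.")

try:
    run_training()
except KeyboardInterrupt:
    # Old behavior: Ctrl+C (or stopping the game on the Unity side)
    # still ended with a saved model.
    save_model()
```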
@xiaomaogy Here are the models generated with TensorFlow 2.0, merged with the latest commits from the 0.8 branch. Maybe you can test whether they work on Android and iOS.
3DBall: https://www19.zippyshare.com/v/cgG7AWza/file.html
Hallway: https://www19.zippyshare.com/v/cH7uWwrR/file.html
GridWorld: https://www19.zippyshare.com/v/X70sT51v/file.html
The problem I encountered turned out to be a Barracuda bug: it does not work with a discrete vector action space that uses branches. If the Android and iOS tests are OK, we may have a working ml-agents with TensorFlow 2.0. It isn't converted to the eager execution style, but at least it works.
https://github.com/Unity-Technologies/ml-agents/issues/1850 I think it should be OK to close this issue and continue the discussion there instead, because I think the next upgrade should be to 2.0. And I've tested that my migrated branch works on both CPU and GPU.
> @TashaSkyUp We used to use TensorFlowSharp, which doesn't have 1.8+ support for those platforms. We have now switched to Barracuda, which should support 1.8+ on those platforms, but we haven't tested it yet.
Is the Barracuda conversion possible now? I am having difficulty converting a TensorFlow 1.13.1 graph to .nn.
The fixes for this are now merged into develop and should come soon in v0.10, so I'm going to close this issue.
Are the TensorFlow versions after 1.7 supported? I have been trying to run a graph in Unity that was trained in TensorFlow 1.9. I tried converting it using the script segment in the README, but it gives me an error: "TypeError: names_to_saveables must be a dict mapping string names to Tensors/Variables. Not a variable: Tensor("BoxPredictor_0/BoxEncodingPredictor/biases:0", shape=(12,), dtype=float32)". Any suggestions? Thanks in advance.
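For reference, here is a toy snippet that triggers the same TypeError, assuming a TF 1.x API (the tensor name just echoes my error message); it suggests the Saver is being handed plain tensors, e.g. constants from an already-frozen graph, rather than restorable variables:

```python
import tensorflow as tf  # TF 1.x-style API assumed

graph = tf.Graph()
with graph.as_default():
    # In a frozen graph, former variables are plain constant tensors.
    biases = tf.zeros([12], name="BoxPredictor_0/BoxEncodingPredictor/biases")
    # Raises: TypeError: names_to_saveables must be a dict mapping string
    # names to Tensors/Variables. Not a variable: Tensor("BoxPredictor_0/...")
    saver = tf.train.Saver([biases])
```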