Closed PBerit closed 2 years ago
@PBerit
To expedite troubleshooting, could you please provide complete code that reproduces the issue reported here? While trying to reproduce it, I got an error at env = DSM_BT1_Env(): NameError: name 'DSM_BT1_Env' is not defined. Thanks!
@pindinagesh : Thanks for your answer pindinagesh. Actually, env = DSM_BT1_Env() is a custom OpenAI Gym environment for reinforcement learning. It has more than 800 lines of code, which is why I did not want to post it. But the environment itself is not the problem. The problem is that TensorBoard sometimes starts and sometimes does not. Today, for example, it started immediately with the commands posted above (and once it has started I don't want to close it, because then I might not be able to start it again). I am pretty sure I will have problems in future attempts to start it.
When using TensorBoard in notebook environments (like Spyder), TensorBoard attempts to reuse existing instances rather than starting a new instance every time it is invoked.
See details in https://www.tensorflow.org/tensorboard/tensorboard_in_notebooks
The same TensorBoard backend is reused by issuing the same command.
If a different logs directory was chosen, a new instance of TensorBoard would be opened.
Ports are managed automatically.
You may be able to address your issue by
@bileschi : Thanks bileschi for your comment. Actually, even after I have restarted my computer, TensorBoard sometimes does not start, although I use exactly the same code and the same port number. What I then have to do is try different port numbers one after another. Sometimes the second number works, but sometimes I need to go through 20 port numbers. This is pretty strange and inconvenient.
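Instead of trying port numbers by hand, one could check from Python whether something else already holds a port. The following is only a minimal sketch: the helper names port_is_free and first_free_port are hypothetical, and a free port does not by itself guarantee that TensorBoard will start, it only rules out a port conflict as the cause.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if we can bind to (host, port), i.e. no other socket holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Binding failed, typically because the address is already in use.
            return False
        return True


def first_free_port(start: int = 8155, attempts: int = 20) -> int:
    """Scan `attempts` consecutive ports from `start`; return the first free one."""
    for port in range(start, start + attempts):
        if port_is_free(port):
            return port
    raise RuntimeError(f"no free port in {start}..{start + attempts - 1}")
```

If port_is_free(8155) returns True and TensorBoard still cannot be reached on that port, the cause is probably something other than a port that is in use.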
@bileschi : Thanks for your answer bileschi. Any comments to my last comment? Do you have an idea as to why I always have to enumerate through several port numbers (1 to 20), even after having restarted the computer, to start Tensorboard? I'll highly appreciate every further comment from you.
Hi @PBerit , I'm sorry I don't really know. I suspect it may have something to do with the Spyder environment, but I would need to reproduce locally to be more confident. Unfortunately the TensorBoard team does not have the resources to guarantee support for Spyder. Do you know if you can reproduce the problem in Jupyter?
As a workaround, do you have access to the location where Spyder writes the log files? If so, you can run TensorBoard externally from the Spyder environment, from your own console, which may give you more control and stability. Another option is to try uploading to tensorboard.dev, the hosted solution.
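As a rough illustration of running TensorBoard outside the IDE: from a normal console this would usually be tensorboard --logdir <your log directory> --port 8155. The sketch below builds the equivalent command for launching it as a separate process from Python. The helper name tensorboard_command and the "./logs" path in the usage comment are placeholders, not taken from this thread.

```python
import sys
from typing import List


def tensorboard_command(logdir: str, port: int = 6006) -> List[str]:
    """Build the argument list for starting TensorBoard as its own process.

    Uses the current interpreter so that the TensorBoard installed in the
    active environment is the one that gets launched.
    """
    return [
        sys.executable, "-m", "tensorboard.main",
        "--logdir", logdir,
        "--port", str(port),
    ]


# Usage (placeholder path), run from a plain Python session:
#   import subprocess
#   proc = subprocess.Popen(tensorboard_command("./logs", 8155))
#   ...
#   proc.terminate()  # stops this TensorBoard instance when done
```

Because the process is started and held by your own script, you keep a handle to it and can stop it without looking up a pid.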
@bileschi : Thanks for your answer and effort bileschi. Okay, then I will just continue to try different port numbers to start TensorBoard.
Hi all,
I tried using another IDE (PyCharm), as bileschi assumed that the problem is caused by Spyder. But this is not the case. However, when using PyCharm I get a little more information when I use a port number on which TensorBoard does not start. I get this output in the console:
"Reusing TensorBoard on port 8111 (pid 10180), started 0:23:16 ago. (Use '!kill 10180' to kill it.)
Please visit http://localhost:8111 in a web browser."
This means that I have already used this port number. Unfortunately the instructions don't work. When I type in kill 10180 I get the error message "SyntaxError: invalid syntax". When I type '!kill 10180' I get the output "'!kill 10180'", but this does not change anything (I think the second command is treated as a string). Do you have any idea how I can "kill" that port number to make it accessible for TensorBoard?
Any comments to my last comment?
You may be able to kill it from the terminal command line, rather than through the python notebook?
@bileschi : Thanks for your answer bileschi. I tried what you suggested but I also get an error message:
PS C:\Users\User1\Python> !kill 4048
!kill : The term "!kill" is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelling of the name, or if a path was included, verify that the path is correct and try again.
At line:1 char:1
+ !kill 4048
+ ~~~~~
+ CategoryInfo : ObjectNotFound: (!kill:String) [], CommandNotFoundException
+ FullyQualifiedErrorId : CommandNotFoundException
The error message (originally in German) just says that the command !kill is not recognized as the name of a cmdlet, a function, or a script.
@bileschi: Thanks for your answers bileschi. Any comment to my last comment? I'll highly appreciate every further comment from you.
Responding to comment from 6 days ago, I think the issue may be that you tried the following two commands within your IDE:
kill 10180
'!kill 10180'
But the one that you want to try is
!kill 10180
(yes exclamation point, no quotes)
--
FYI I think that the Windows analog to killing a process on Unix is the 'taskkill' command.
However, I'm not sure if the PyCharm IDE is going to expose processes that it starts as separate Windows processes or if it keeps them all wrapped in its own runtime.
@bileschi : Thanks bileschi for your answer and effort. I really appreciate it. Unfortunately I get an error message when using your suggested code in the PyCharm console: !kill 10180 (with exclamation point and no quotes). It says (translated) "the statement kill is either wrongly spelled or could not be found".
@bileschi : Thanks for your answers bileschi. Any comments to my last comment? I'll highly appreciate every further comment from you.
What is happening here is that the exclamation point prefix issues a command to be run in the shell (command line). The command kill exists in Unix to terminate processes by id. Hence, in Unix, kill 10180 would terminate the process with id 10180. Your system is running Windows, which does not have the command kill. It has taskkill instead, so you should look to the manual for that command to determine how to use it. I am not a Windows user, so I'm not sure how to use it correctly. It may be as simple as !taskkill /pid 10180, but I may be missing something.
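To make the kill versus taskkill distinction concrete, here is a small sketch that picks the command by operating system. The helper name kill_command is hypothetical. The /PID and /F switches are standard taskkill options, but whether this actually frees the port depends on the process with that pid still being the TensorBoard instance.

```python
import os
from typing import List


def kill_command(pid: int, os_name: str = os.name) -> List[str]:
    """Return the shell command that terminates process `pid` on this OS."""
    if os_name == "nt":
        # Windows has no `kill`; taskkill /F forces termination of the given PID.
        return ["taskkill", "/PID", str(pid), "/F"]
    # Unix-like systems: `kill <pid>` sends SIGTERM to the process.
    return ["kill", str(pid)]


# Usage from any Python console (no "!" prefix needed, since "!" is
# IPython/Jupyter syntax and not plain Python):
#   import subprocess
#   subprocess.run(kill_command(10180), check=False)
```

Going through subprocess avoids depending on the console understanding the "!" prefix at all.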
@bileschi : Thanks for your answer. Unfortunately your suggested command does not work.
But maybe someone else in the forum who has experience with Windows can answer my question. I tried several commands (from this website https://winaero.com/kill-process-windows-10/#Kill_a_process_using_PowerShell), none of which worked, with or without the exclamation point:
!taskkill /pid 10180
!taskkill /F /pid 10180
!taskkill 10180
!taskkill pid 10180
!Stop-Process -ID 10180 -Force
!Stop-Process -pid 10180 -Force
Would anyone mind telling me how to kill the process such that I can start Tensorboard again on the same port? I'll appreciate every comment.
Any comments on my last comment? Can anyone help me on how to kill a process in Windows 10 (more specifically how to kill the TensorBoard process that blocks a specific port). I'll highly appreciate every further comment.
Does nobody have an idea? I'll appreciate every comment.
Perhaps StackOverflow can help you with your question about finding a running process and killing it from the command line?
I just wanted to mention that the initial problem (sometimes TensorBoard can be started and sometimes not) is in fact related to the use of TensorBoard on Windows. So it is also a problem of TensorBoard itself. But there is a workaround that you can see here: https://stackoverflow.com/questions/59563025/how-to-reset-tensorboard-when-it-tries-to-reuse-a-killed-windows-pid/59582163#59582163. Thanks bileschi for your great help and the advice to ask this on StackOverflow.
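For readers who cannot open the link: as far as I understand it (this is an assumption, not something stated in this thread), the workaround amounts to deleting the leftover instance-info files that TensorBoard keeps in a .tensorboard-info folder inside the system temp directory. TensorBoard then stops trying to reuse an instance whose process no longer exists. A minimal sketch, with the hypothetical helper name clear_tensorboard_info and the assumed file pattern pid-*.info:

```python
import tempfile
from pathlib import Path
from typing import List, Optional


def clear_tensorboard_info(info_dir: Optional[Path] = None) -> List[Path]:
    """Delete TensorBoard instance-info files and return the paths removed."""
    if info_dir is None:
        # Assumed default location: <system temp dir>/.tensorboard-info
        info_dir = Path(tempfile.gettempdir()) / ".tensorboard-info"
    removed: List[Path] = []
    if info_dir.is_dir():
        # Assumed naming scheme: one "pid-<pid>.info" file per instance.
        for info_file in sorted(info_dir.glob("pid-*.info")):
            info_file.unlink()
            removed.append(info_file)
    return removed
```

Only run this when no TensorBoard instance that you still need is running, because it removes the records for live instances too.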
Hi all,
I just started to use TensorBoard. I have some Python code (about reinforcement learning) and I run it. Then I enter the following commands into the Spyder console.
Then I just open my browser (Firefox) and type in: http://localhost:8155
The strange thing is that sometimes TensorBoard is shown correctly in the browser, but sometimes I just get an error message that the destination can't be reached. In that case it might help to change the port number. But this does not always solve the problem. Sometimes even changing the port number 10 times does not start TensorBoard. The strange thing is that although nothing changes in the code and nothing changes in the commands for starting TensorBoard (and the same computer and browser are used), sometimes TensorBoard starts immediately on the first port used (8155), sometimes I have to try several port numbers before it starts, and sometimes it does not start at all.
Can any of you think of a possible explanation for this behaviour?
Here is the code that I use: