lilianweng / stock-rnn

Predict stock market prices using RNN model with multilayer LSTM cells + optional multi-stock embeddings.
https://lilianweng.github.io/lil-log

session run is gray in tensorboard->graph, and unknown device #26

Closed elulue closed 5 years ago

elulue commented 5 years ago

Hi, the py3 branch works fine on my PC. I use Ubuntu 18.04, Python 3.5, and TensorFlow 1.10 with a single Nvidia 1070 video card. During training, GPU usage is around 30% while most of the video card memory is occupied. I wanted to see whether there is room to improve performance, so I went to TensorBoard.

But the device shows as unknown when I check it in TensorBoard -> Graph, and I cannot see the compute time either. Could you please let me know if there is any tip to fix it? Thanks a lot.


elulue commented 5 years ago

By the way, when I use tfdbg, I see this warning: "WARNING:tensorflow:Failed to load partition graphs for device /job:localhost/replica:0/task:0/device:CPU:0 from disk. As a fallback, the client graphs will be used. This may cause mismatches in device names."

elulue commented 5 years ago

I've found the root cause: TensorBoard doesn't record compute time, device placement, etc. unless the session run is traced. I updated the code as below, and it is working for me:


                if np.mod(global_step, show_every_n_step) == 1:
                    # Ask the runtime for a full trace (compute time, memory, device) on this step only.
                    run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE)
                    run_metadata = tf.RunMetadata()

                    train_loss, train_mse, _, train_merged_sum = self.sess.run(
                        [self.loss, self.mse, self.optim, self.summary], train_data_feed,
                        options=run_options, run_metadata=run_metadata)

                    # Each run-metadata tag must be unique, so include the step number.
                    self.writer.add_run_metadata(run_metadata, 'step{}'.format(global_step))
                    self.writer.add_summary(train_merged_sum, global_step=global_step)

                else:
                    # Ordinary training step: no tracing overhead.
                    train_loss, train_mse, _, train_merged_sum = self.sess.run(
                        [self.loss, self.mse, self.optim, self.summary], train_data_feed)
                    self.writer.add_summary(train_merged_sum, global_step=global_step)
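
To see which steps the condition `np.mod(global_step, show_every_n_step) == 1` actually traces, here is a small pure-Python sketch. The helper names `should_trace` and `traced_steps` are made up for illustration and are not part of the repo. It also assumes `global_step` starts at 1, which may not match the real training loop.

```python
def should_trace(global_step, show_every_n_step):
    """Return True when the snippet's condition would request a full trace."""
    # For positive integers this matches np.mod(global_step, show_every_n_step) == 1.
    return global_step % show_every_n_step == 1


def traced_steps(total_steps, show_every_n_step, first_step=1):
    """List the steps in [first_step, first_step + total_steps) that get traced.

    first_step=1 is an assumption about where the step counter starts.
    """
    return [
        step
        for step in range(first_step, first_step + total_steps)
        if should_trace(step, show_every_n_step)
    ]


if __name__ == "__main__":
    # With show_every_n_step=4, steps 1, 5 and 9 are traced.
    print(traced_steps(total_steps=10, show_every_n_step=4))
    # With show_every_n_step=1 the remainder is always 0, so nothing is traced.
    print(traced_steps(total_steps=10, show_every_n_step=1))
```

One thing to watch: if `show_every_n_step` is 1, the remainder is never 1, so no step is traced and TensorBoard will still show no compute time. Comparing against 0 instead would avoid that case, at the cost of changing which steps get traced.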