Memory issue with multiclass training

jinhangw commented 7 years ago

Thank you so much for the guidance on multiclass training in #15 and #29. I was able to start training on my dataset (470 images total, same size as KITTI). My machine has a GPU with 8 GB of memory and 8 GB of RAM. The process gets killed every time during the evaluation at step 1900 of 12000. No error appears in the log files either. Could anyone advise on this issue, please?

Thanks!

villanuevab commented 7 years ago

Are you sure it is a memory issue? Can you share the stack trace, or redirect stderr to a log file? Errors may have been printed to stderr without appearing in the output.log files.
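One way to capture stderr and also see how the process ended is a small wrapper like the sketch below. The helper names `run_and_log` and `describe_exit` are made up for illustration, and the training command in the comment is an assumption about how you launch training. On POSIX systems, Python's `subprocess` reports a negative return code when the child was terminated by a signal, so a process killed by the Linux out-of-memory killer would typically show up as SIGKILL.

```python
import signal
import subprocess


def run_and_log(cmd, stdout_path, stderr_path):
    """Run cmd, write stdout and stderr to separate files, return the exit code."""
    with open(stdout_path, "w") as out, open(stderr_path, "w") as err:
        proc = subprocess.run(cmd, stdout=out, stderr=err)
    return proc.returncode


def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable description."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # POSIX only: a negative code means the child was terminated by a signal.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "signal %d" % -returncode
        return "killed by " + name
    return "exited with status %d" % returncode


# Hypothetical usage (the command line is an assumption, adjust to your setup):
# code = run_and_log(["python", "train.py", "--hypes", "hypes/KittiSeg.json"],
#                    "train_stdout.log", "train_stderr.log")
# print(describe_exit(code))
```

If the result is "killed by SIGKILL" and the stderr log is empty, that points toward the operating system killing the process rather than a Python exception.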

You need to write your own eval code because TensorVision currently only works in the binary case. See the FAQ: https://github.com/MarvinTeichmann/KittiSeg/blob/master/docu/FAQ.md for more details. In particular:

In addition, you will need to write new evaluation code. The current evaluator file computes kitti scores which are only defined on binary segmentation problems.
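As a starting point for multiclass evaluation, a common metric is mean intersection-over-union (mean IoU) computed from a confusion matrix. The sketch below is a minimal, pure-Python illustration on flat lists of class labels. It is not the KittiSeg or TensorVision evaluator interface, and all function names are hypothetical. For full images you would probably want a vectorized version.

```python
def confusion_matrix(gt_labels, pred_labels, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for gt, pred in zip(gt_labels, pred_labels):
        cm[gt][pred] += 1
    return cm


def per_class_iou(cm):
    """IoU per class: TP / (TP + FP + FN); None if the class never occurs."""
    ious = []
    for c in range(len(cm)):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                   # ground truth c, predicted as another class
        fp = sum(row[c] for row in cm) - tp    # predicted c, ground truth is another class
        denom = tp + fp + fn
        ious.append(float(tp) / denom if denom else None)
    return ious


def mean_iou(cm):
    """Average the IoU over classes that occur in ground truth or prediction."""
    valid = [v for v in per_class_iou(cm) if v is not None]
    return sum(valid) / len(valid) if valid else 0.0


# Toy example: six "pixels" and three classes.
gt = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(gt, pred, num_classes=3)
print(per_class_iou(cm))
print(mean_iou(cm))
```

Accumulating one confusion matrix over the whole validation set and computing the IoU once at the end avoids keeping all predictions in memory.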

The training script calls the evaluator during training at the interval defined in your hypes .json file (the default is KittiSeg.json). See:

"logging": {
    "display_iter": 20,
    "eval_iter": 100,
    "write_iter": 100,
    "save_iter": 2000,
    "image_iter": 20000
},

In the example above, the evaluator is called every 100 training steps.
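To make the schedule concrete, the sketch below parses the same logging block and lists the steps at which evaluation would run. The trigger rule in `should_evaluate` (a positive step that is a multiple of `eval_iter`) is an assumption for illustration, not a copy of the TensorVision training loop, and the function name is hypothetical.

```python
import json

# The logging block from the hypes file, wrapped in braces so it parses as JSON.
HYPES_FRAGMENT = """
{
  "logging": {
    "display_iter": 20,
    "eval_iter": 100,
    "write_iter": 100,
    "save_iter": 2000,
    "image_iter": 20000
  }
}
"""


def should_evaluate(step, eval_iter):
    """Assumed rule: evaluate on every positive multiple of eval_iter."""
    return step > 0 and step % eval_iter == 0


hypes = json.loads(HYPES_FRAGMENT)
eval_iter = hypes["logging"]["eval_iter"]

# Steps in the first 500 iterations at which the evaluator would run.
eval_steps = [s for s in range(0, 501) if should_evaluate(s, eval_iter)]
print(eval_steps)  # [100, 200, 300, 400, 500]
```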

In order to allow the training script to run without writing your own eval code, you must comment out the following lines:

NingNingL commented 6 years ago

@jinhangw I also want to do multiclass training. Can you tell me which places I need to modify? Thank you.

MarvinTeichmann / KittiSeg

Memory issue with multiclass training #66