Closed nealzGu closed 6 years ago
I imagine this is caused by your tweaks to keras-retinanet. With regards to the Python path, execute from the root of the project: `export PYTHONPATH=./` followed by `python scripts/train.py ...` should suffice to avoid that "cannot find keras-retinanet" issue.
EDIT: Bear in mind that a big merge just took place 2 hours ago for Multi-GPU training. Ensure you have the latest code that aligns with the Readme :)
hi, thanks for the reply. what do you mean by "cannot find...?"
it's not finding the class name "Projection" in the listed class names. the listed class names come from /preprocessing/pascal_voc.py In this file I changed the list, so that it's matching my custom annotations.
Now I also found a pascal_voc.py file in ~/keras-retinanet/build/.../preprocessing Changing the list of class names there has no effect.
Not sure what went wrong here ... :-/
I'm not sure where you're getting `im8_scene01031-1071.xml` from? What is the dataset which you are passing? Is this inference, or training? Going to need a few more execution details to understand what you're trying to do, and what could be the issue.
Training with pascal, but want to change the images and labels only (keep rest of pascal structure as it is). that worked with an older branch. so images and labels are custom.
my dataset is located at ~/data/...
maybe it's just not optimal or possible to train custom data with pascal structure?
will try to use the CSV way in the meantime!
It does seem odd that it used to work. I can understand the intuition behind that approach. Presumably there's just a place that has been missed in either the data or the code for classes; related to the recent changes. Going with the CSV approach sounds sensible, it is the intended custom dataset approach. Probably leave the issue open, as the developers might have more of an understanding of why this might be the case than I do.
ok, thanks!
following the csv approach, I get:
Traceback (most recent call last):
File "scripts/train.py", line 187, in <module>
train_generator, validation_generator = create_generators(args)
File "scripts/train.py", line 130, in create_generators
args.csv_path,
AttributeError: 'Namespace' object has no attribute 'csv_path'
any ideas on this one? :)
Caused by the argparser not having an option for the csv_path. I imagine in previous versions of code argparser has csv_path, mimicking pascal_path, and coco_path argument options.
Either change train.py:133, or add an argument which creates args.csv_path near L160. Hope this helps.
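For reference, a minimal sketch of the second option (adding the missing argument). The sub-command layout and the name `classes_path` are assumptions made for illustration and are not taken from the project's actual parser; only `csv_path`, `pascal_path` and `batch_size` appear in this thread.

```python
import argparse


def build_parser():
    """Hypothetical parser with one sub-command per dataset type."""
    parser = argparse.ArgumentParser(description="Training script sketch.")
    parser.add_argument("--batch-size", type=int, default=1)

    subparsers = parser.add_subparsers(dest="dataset_type")
    subparsers.required = True

    pascal = subparsers.add_parser("pascal")
    pascal.add_argument("pascal_path", help="Path to a VOCdevkit/VOC20xx directory.")

    csv_parser = subparsers.add_parser("csv")
    # Without this argument, reading args.csv_path raises AttributeError.
    csv_parser.add_argument("csv_path", help="Path to the annotations CSV file.")
    # 'classes_path' is an assumed name for the class-mapping file.
    csv_parser.add_argument("classes_path", help="Path to the class-mapping CSV file.")

    return parser


args = build_parser().parse_args(["csv", "annotations.csv", "classes.csv"])
print(args.csv_path)
# Attributes of other sub-commands are simply absent, so guard optional ones.
print(getattr(args, "pascal_path", None))
```

Reading optional attributes through `getattr(args, name, None)` is one way to keep the script from crashing when a sub-command does not define them.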
ok, think I got that one.
now its throwing this at me:
Traceback (most recent call last):
File "scripts/train.py", line 188, in <module>
train_generator, validation_generator = create_generators(args)
File "scripts/train.py", line 133, in create_generators
batch_size=args.batch_size
File "build/bdist.linux-x86_64/egg/keras_retinanet/preprocessing/csv.py", line 125, in __init__
AttributeError: 'module' object has no attribute 'reader'
do you have another idea?
There was indeed a mismatch in the arguments for CSV training, should be fixed on master now. Apparently naming the CSV generator file `csv.py` causes a conflict with the Python module `csv`, so I renamed it back to `csv_generator.py`.
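To illustrate the kind of name conflict described above, here is a small sketch. It is not the project's code: in this thread the clash happened on Python 2.7 inside a package, whereas this sketch reproduces the same effect on Python 3 by putting a same-named file first on `sys.path`. The helper name `shadowed_module_has` is made up for the example.

```python
import importlib
import os
import sys
import tempfile


def shadowed_module_has(module_name, attribute):
    """Import `module_name` while a near-empty file of the same name is first
    on sys.path, and report whether the imported module has `attribute`."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, module_name + ".py"), "w") as handle:
            handle.write("# local file that shadows the standard library module\n")

        saved = sys.modules.pop(module_name, None)  # forget any cached import
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            module = importlib.import_module(module_name)
            return hasattr(module, attribute)
        finally:
            # Undo everything so later imports see the real module again.
            sys.path.remove(tmp)
            sys.modules.pop(module_name, None)
            if saved is not None:
                sys.modules[module_name] = saved


# The shadowing file defines nothing, so `reader` is missing from it.
print(shadowed_module_has("csv", "reader"))
```

When the local file wins the lookup, `csv.reader` does not exist, which matches the `'module' object has no attribute 'reader'` error earlier in the thread.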
Regarding training on a custom VOC-style dataset, it should be possible. You can pass your custom dictionary of class_name -> label here.
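A rough sketch of what such a dictionary and the lookup behind the error could look like. The class name `Projection` comes from this thread; the helper `label_for` is hypothetical and only mimics the check, and how the dictionary is handed to the generator is deliberately left out because that depends on the version you have checked out.

```python
# Hypothetical custom mapping of class_name -> label for a single-class dataset.
custom_classes = {"Projection": 0}


def label_for(class_name, classes):
    """Return the integer label for class_name, or fail like the generator does."""
    if class_name not in classes:
        raise ValueError(
            "class name '{}' not found in classes: {}".format(class_name, sorted(classes))
        )
    return classes[class_name]


print(label_for("Projection", custom_classes))
```

If the lookup still reports the default VOC names, the mapping being read is not the one you edited.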
hmm I removed everything and installed the master branch again.
with "pascal" option Im now getting a from keras_retinanet.preprocessing.csv_generator import CSVGenerator ImportError: No module named csv_generator
The generator file is there and it is imported ...
Did you update your installed version of `keras-retinanet`? Or did you prepend your command with `PYTHONPATH=/path/to/keras-retinanet`?
I removed the keras-retinanet and cocoapi folders from /home/user/
cloned the repo
ran python setup.py install from inside the repo
ran pip install --user --upgrade git+https://github.com/broadinstitute/keras-resnet from inside the retinanet repo (and later from the home directory)
installed cocoapi as described
changed the class labels
ran python scripts/train.py pascal <path to VOCdevkit/VOC2007>
the error message is also showing for my custom data with csv approach ... :-/
Pythonpath is correct ...
Can you share the complete error ? Including the traceback.
ivision@ivision:~/keras-retinanet$ python scripts/train.py pascal /home/ivision/data/VOCdevkit/VOC2012/
Using TensorFlow backend.
Traceback (most recent call last):
File "scripts/train.py", line 31, in <module>
from keras_retinanet.preprocessing.csv_generator import CSVGenerator
ImportError: No module named csv_generator
the files are all there...
Can you try running:
PYTHONPATH=. python scripts/train.py pascal /home/ivision/data/VOCdevkit/VOC2012/
same error :-/
@nealzGu: Could you post console output and tracebacks in code blocks using triple backticks? For a markdown cheatsheet see: https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet
sorry, yes I will give my best next time ;)
so, I uninstalled and installed everything again. no missing csv_generator anymore. wohoo!
now Im back to the classes, which I changed in keras_retinanet/preprocessing/pascal_voc.py
still error :-( appreciate your help!
None
Epoch 1/50
Traceback (most recent call last):
File "scripts/train.py", line 211, in <module>
callbacks=callbacks,
File "/usr/local/lib/python2.7/dist-packages/keras/legacy/interfaces.py", line 87, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 2115, in fit_generator
generator_output = next(output_generator)
File "/usr/local/lib/python2.7/dist-packages/keras/utils/data_utils.py", line 735, in get
six.reraise(value.__class__, value, value.__traceback__)
File "/usr/local/lib/python2.7/dist-packages/keras/utils/data_utils.py", line 635, in data_generator_task
generator_output = next(self._generator)
File "build/bdist.linux-x86_64/egg/keras_retinanet/preprocessing/generator.py", line 226, in next
File "build/bdist.linux-x86_64/egg/keras_retinanet/preprocessing/generator.py", line 198, in compute_input_output
File "build/bdist.linux-x86_64/egg/keras_retinanet/preprocessing/generator.py", line 78, in load_annotations_group
File "build/bdist.linux-x86_64/egg/keras_retinanet/preprocessing/pascal_voc.py", line 163, in load_annotations
File "/home/ivision/.local/lib/python2.7/site-packages/six.py", line 737, in raise_from
raise value
ValueError: invalid annotations file: 058.xml: could not parse object #0: class name 'Projection' not found in classes: ['sheep', 'horse', 'bicycle', 'bottle', 'cow', 'sofa', 'bus', 'dog', 'cat', 'person', 'train', 'diningtable', 'aeroplane', 'car', 'pottedplant', 'tvmonitor', 'chair', 'bird', 'boat', 'motorbike']
trying to train on custom data gives me this message. tried to replace it with the pascal image generator, but this only leads to another error...
any ideas how I should add this generator??
ivision@ivision:~/keras-retinanet$ python scripts/train.py csv ~/data/VOCdevkit/VOC2012/ImageSets/Main/annotations.csv ~/data/VOCdevkit/VOC2012/ImageSets/Main/classes.csv
Using TensorFlow backend.
2017-12-06 16:46:04.466624: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2017-12-06 16:46:04.601368: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:892] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-12-06 16:46:04.601795: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties:
name: Quadro P4000 major: 6 minor: 1 memoryClockRate(GHz): 1.2275
pciBusID: 0000:01:00.0
totalMemory: 7.92GiB freeMemory: 7.60GiB
2017-12-06 16:46:04.601808: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: Quadro P4000, pci bus id: 0000:01:00.0, compute capability: 6.1)
Traceback (most recent call last):
File "scripts/train.py", line 193, in <module>
train_generator, validation_generator = create_generators(args)
File "scripts/train.py", line 138, in create_generators
image_datrain_image_data_generator,
NameError: global name 'image_datrain_image_data_generator' is not defined
invalid annotations file: 058.xml: could not parse object #0: class name 'Projection' not found in classes: ['sheep', 'horse', 'bicycle', 'bottle', 'cow', 'sofa', 'bus', 'dog', 'cat', 'person', 'train', 'diningtable', 'aeroplane', 'car', 'pottedplant', 'tvmonitor', 'chair', 'bird', 'boat', 'motorbike']
As the error says, `058.xml` contains an invalid annotation. It uses class `Projection`, which is not a known class. The only known classes are sheep, horse, bicycle, bottle, cow, sofa, bus, dog, cat, person, train, diningtable, aeroplane, car, pottedplant, tvmonitor, chair, bird, boat and motorbike.
yes I see, but I changed the classes in /preprocessing/pascal_voc.py and the pythonpath is ~/keras-retinanet where do those class names come from?!
Traceback (most recent call last):
File "scripts/train.py", line 193, in <module>
train_generator, validation_generator = create_generators(args)
File "scripts/train.py", line 138, in create_generators
image_datrain_image_data_generator,
NameError: global name 'image_datrain_image_data_generator' is not defined
@hgaiser is on this one. UPDATE: it should be fixed on the master branch.
yes I see, but I changed the classes in /preprocessing/pascal_voc.py and the pythonpath is ~/keras-retinanet where do those class names come from?!
They do come from `preprocessing/pascal_voc.py`. If they aren't recognized there probably is a pythonpath issue. However, if you have a custom dataset I would recommend using the `CSVGenerator`. It has a much simpler format for defining annotations and allows you to specify your own classes without modifying any code.
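If you already have VOC-style XML files, something along these lines could turn them into CSV rows. This is only a sketch: the column order `path,x1,y1,x2,y2,class_name` is my assumption about what the CSV generator expects (please check the README of the version you use), and the XML snippet with its coordinates is made-up example data.

```python
import csv
import io
import xml.etree.ElementTree as ET


def voc_xml_to_rows(xml_text, image_path):
    """Return one [path, x1, y1, x2, y2, class_name] row per annotated object."""
    root = ET.fromstring(xml_text)
    rows = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        rows.append([
            image_path,
            int(float(box.findtext("xmin"))),
            int(float(box.findtext("ymin"))),
            int(float(box.findtext("xmax"))),
            int(float(box.findtext("ymax"))),
            obj.findtext("name").strip(),
        ])
    return rows


def rows_to_csv(rows):
    """Serialize rows as CSV text, one annotation per line."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Made-up annotation in the usual VOC layout.
EXAMPLE_XML = """<annotation>
  <filename>058.jpg</filename>
  <object>
    <name>Projection</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax></bndbox>
  </object>
</annotation>"""

rows = voc_xml_to_rows(EXAMPLE_XML, "images/058.jpg")
print(rows_to_csv(rows))
```

The class names then live in the data files instead of in `pascal_voc.py`, so no code has to be edited for a new dataset.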
echo $PYTHONPATH
gives
:/home/ivision/keras-retinanet
I even added
import sys; sys.path.append('/home/ivision/keras-retinanet')
at the top of the scripts. What more can I do?
The custom approach is giving the above mentioned error, but lookin forward to hgaisers work :)
any other idea what could be wrong? I mean, the error message says
File "build/bdist.linux-x86_64/egg/keras_retinanet/preprocessing/pascal_voc.py", line 163, in load_annotations
but the folder /build/bdist.linux-x86_64/ is empty. It seems as there would be the pascal_voc.py file containing the wrong class names ... :-/
tried it before, it gives:
Traceback (most recent call last):
File "scripts/train.py", line 193, in <module>
train_generator, validation_generator = create_generators(args)
File "scripts/train.py", line 139, in create_generators
batch_size=args.batch_size
File "build/bdist.linux-x86_64/egg/keras_retinanet/preprocessing/csv_generator.py", line 141, in __init__
TypeError: __init__() takes at least 2 arguments (2 given)
Thanks for the testing by the way, we need to add more unit tests to easily cover these kind of issues ... for now this will have to do though :)
Here is the latest change: https://github.com/fizyr/keras-retinanet/commit/81639172b847054bdca5646a9b69256ce2a45d70 , can you test that?
hey, thank you for all the cool stuff.
unfortunately, Im getting still the same error message :-(
can someone explain why it's referencing a file in /build/.../preprocessing/ ??? I dont understand that...
Traceback (most recent call last):
File "scripts/train.py", line 193, in <module>
train_generator, validation_generator = create_generators(args)
File "scripts/train.py", line 139, in create_generators
batch_size=args.batch_size
File "build/bdist.linux-x86_64/egg/keras_retinanet/preprocessing/csv_generator.py", line 141, in __init__
TypeError: __init__() takes at least 2 arguments (2 given)
Seems it's still using some installed version of `keras-retinanet`.
What I usually do, since it changes so often: I don't have `keras-retinanet` installed and prepend all my commands with `PYTHONPATH=.` (and run from the `keras-retinanet` root).
Could you `pip uninstall keras-retinanet` and try that?
That build folder was probably created by running `setup.py` manually. So I would recommend also deleting that folder manually in addition to running `pip uninstall keras-retinanet`.
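A quick way to check which copy Python would actually pick up is to ask the import system where a module comes from. This is a generic sketch using the standard library on Python 3; the helper name `module_origin` is made up, and `json` is only there as a module that is always available.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


for name in ("keras_retinanet", "json"):
    # A path inside site-packages, dist-packages or an .egg means an installed
    # copy wins over the checkout you are editing.
    print(name, "->", module_origin(name))
```

If the printed path for `keras_retinanet` is not inside your clone, edits to the clone will have no effect until the installed copy is removed or the clone comes first on the path.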
trying to uninstall also throws an error :dagger:
OSError: [Errno 2] No such file or directory: '/usr/local/lib/python2.7/dist-packages/keras_retinanet-0.0.1-py2.7.egg'
I was missing this folder anyway, because I had it when I cloned this repo like 2-3 weeks ago. since then, I did a lot of training on multiple custom data sets. now its a different machine and it seems as you guys did a lot of work since then :D
thanks for your patience. 1) after uninstalling, what would be the root path? cloning the repo, do everything as usual, but do not install, but therefore run every python command with PYTHONPATH?
@de-vri-es yes I ran it manually!
The root as in the main folder of the repository. Alternatively you can prepend it with `PYTHONPATH=/path/to/keras-retinanet` and you can run it from anywhere.
thanks man, that did the trick! I just started training with pascal and custom data and it's looking good. I don't really get what was wrong, but for now I'm happy to be back on the training track.
I will quickly check the CSV issue and maybe we can close this one afterwards...
alright, two more things and then I'll leave you alone for today.
csv approach is giving this error message:
Traceback (most recent call last):
File "scripts/train.py", line 208, in <module>
train_generator, validation_generator = create_generators(args)
File "scripts/train.py", line 139, in create_generators
if args.val_path:
AttributeError: 'Namespace' object has no attribute 'val_path'
Can I maybe simply uncomment this (for now)?
inference. my old notebook and scripts from two weeks ago are of course not working any more. I get stuck on this part:
val_generator = CocoGenerator(
    '/home/ivision/data/VOCdevkit/VOC20120/',
    'test',
    val_image_data_generator,
    batch_size=1,
)
Where can I find the code for the Coco val_generator to modify? Is there a pascal val_generator already?
Thanks!
alright, two more things and then I'll leave you alone for today.
csv approach is giving this error message: Traceback (most recent call last): File "scripts/train.py", line 208, in <module>
train_generator, validation_generator = create_generators(args) File "scripts/train.py", line 139, in create_generators if args.val_path: AttributeError: 'Namespace' object has no attribute 'val_path'
Just pushed a fix for this.
inference. my old notebook and scripts from two weeks ago are of course not working any more. I get stuck on this part: val_generator = CocoGenerator( '/home/ivision/data/VOCdevkit/VOC20120/', 'test', val_image_data_generator, batch_size=1, ) Where can I find the code for the Coco val_generator to modify? Is there a pascal val_generator already?
I'm not sure what you're doing here? You're using a `CocoGenerator` but the path seems to point to Pascal data? Currently, for Pascal, generators are created like:
train_generator = PascalVocGenerator(
    args.pascal_path,
    'trainval',
    train_image_data_generator,
    batch_size=args.batch_size
)
validation_generator = PascalVocGenerator(
    args.pascal_path,
    'test',
    val_image_data_generator,
    batch_size=args.batch_size
)
ps. thanks again for being our guinea pig. I'm going to add unit tests for this right now cause this shouldn't happen again :)
hi guys, training on custom dataset with pascal format went through. I am able to run inference and its detecting my objects. in fact, it's detecting a lot more ;-)
I could solve this manually (only show bbox for highest confidence in specific range...), but I am wondering why it is this way? any ideas?
Im basically doing
model = keras.models.load_model()
image = misc.imread()
image = np.expand_dims()
_, _, detections = model.predict_on_batch(image)
if score > 0.8: draw rectangle
Haha good :)
Two things though: the generators preprocess images by subtracting the ImageNet mean. I advise doing the same when you're testing the model, otherwise you might get worse results.
Second thing is that it seems like your model is not created using non maximum suppression. Try creating a new model with `nms=True` and then calling `load_weights` on that model, should work :).
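To give an idea of what non maximum suppression does to those extra detections, here is a plain-Python sketch of the greedy variant. It is not the layer used in the model; the function names, the example boxes and the IoU threshold of 0.5 are assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of the kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a kept one.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two near-identical boxes on one object plus one box elsewhere (made-up data).
boxes = [(10, 10, 110, 110), (12, 12, 112, 112), (200, 200, 260, 260)]
scores = [0.95, 0.90, 0.85]
print(non_max_suppression(boxes, scores))
```

Without this step every anchor that fires on the same object shows up as its own detection, which would explain seeing many more boxes than objects.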
I just set `nms = True` in train.py around line 52.
Starting to train gets an error:
Traceback (most recent call last):
File "scripts/train.py", line 212, in <module>
model, training_model, prediction_model = create_models(num_classes=train_generator.num_classes(), weights=args.weights, multi_gpu=args.multi_gpu)
File "scripts/train.py", line 57, in create_models
prediction_model = keras.models.Model(inputs=model.inputs, outputs=model.outputs[:2] + [detections])
File "/usr/local/lib/python2.7/dist-packages/keras/legacy/interfaces.py", line 87, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 1811, in __init__
'Layer names: ', all_names)
RuntimeError: ('The name "nms" is used 2 times in the model. All layer names should be unique. Layer names: ', ['input_1', 'padding_conv1', 'conv1', 'bn_conv1', 'conv1_relu', 'pool1', 'res2a_branch2a', 'bn2a_branch2a', 'res2a_branch2a_relu', 'padding2a_branch2b', 'res2a_branch2b', 'bn2a_branch2b', 'res2a_branch2b_relu', 'res2a_branch2c', 'res2a_branch1', 'bn2a_branch2c', 'bn2a_branch1', 'res2a', 'res2a_relu', 'res2b_branch2a', 'bn2b_branch2a', 'res2b_branch2a_relu', 'padding2b_branch2b', 'res2b_branch2b', 'bn2b_branch2b', 'res2b_branch2b_relu', 'res2b_branch2c', 'bn2b_branch2c', 'res2b', 'res2b_relu', 'res2c_branch2a', 'bn2c_branch2a', 'res2c_branch2a_relu', 'padding2c_branch2b', 'res2c_branch2b', 'bn2c_branch2b', 'res2c_branch2b_relu', 'res2c_branch2c', 'bn2c_branch2c', 'res2c', 'res2c_relu', 'res3a_branch2a', 'bn3a_branch2a', 'res3a_branch2a_relu', 'padding3a_branch2b', 'res3a_branch2b', 'bn3a_branch2b', 'res3a_branch2b_relu', 'res3a_branch2c', 'res3a_branch1', 'bn3a_branch2c', 'bn3a_branch1', 'res3a', 'res3a_relu', 'res3b_branch2a', 'bn3b_branch2a', 'res3b_branch2a_relu', 'padding3b_branch2b', 'res3b_branch2b', 'bn3b_branch2b', 'res3b_branch2b_relu', 'res3b_branch2c', 'bn3b_branch2c', 'res3b', 'res3b_relu', 'res3c_branch2a', 'bn3c_branch2a', 'res3c_branch2a_relu', 'padding3c_branch2b', 'res3c_branch2b', 'bn3c_branch2b', 'res3c_branch2b_relu', 'res3c_branch2c', 'bn3c_branch2c', 'res3c', 'res3c_relu', 'res3d_branch2a', 'bn3d_branch2a', 'res3d_branch2a_relu', 'padding3d_branch2b', 'res3d_branch2b', 'bn3d_branch2b', 'res3d_branch2b_relu', 'res3d_branch2c', 'bn3d_branch2c', 'res3d', 'res3d_relu', 'res4a_branch2a', 'bn4a_branch2a', 'res4a_branch2a_relu', 'padding4a_branch2b', 'res4a_branch2b', 'bn4a_branch2b', 'res4a_branch2b_relu', 'res4a_branch2c', 'res4a_branch1', 'bn4a_branch2c', 'bn4a_branch1', 'res4a', 'res4a_relu', 'res4b_branch2a', 'bn4b_branch2a', 'res4b_branch2a_relu', 'padding4b_branch2b', 'res4b_branch2b', 'bn4b_branch2b', 'res4b_branch2b_relu', 
'res4b_branch2c', 'bn4b_branch2c', 'res4b', 'res4b_relu', 'res4c_branch2a', 'bn4c_branch2a', 'res4c_branch2a_relu', 'padding4c_branch2b', 'res4c_branch2b', 'bn4c_branch2b', 'res4c_branch2b_relu', 'res4c_branch2c', 'bn4c_branch2c', 'res4c', 'res4c_relu', 'res4d_branch2a', 'bn4d_branch2a', 'res4d_branch2a_relu', 'padding4d_branch2b', 'res4d_branch2b', 'bn4d_branch2b', 'res4d_branch2b_relu', 'res4d_branch2c', 'bn4d_branch2c', 'res4d', 'res4d_relu', 'res4e_branch2a', 'bn4e_branch2a', 'res4e_branch2a_relu', 'padding4e_branch2b', 'res4e_branch2b', 'bn4e_branch2b', 'res4e_branch2b_relu', 'res4e_branch2c', 'bn4e_branch2c', 'res4e', 'res4e_relu', 'res4f_branch2a', 'bn4f_branch2a', 'res4f_branch2a_relu', 'padding4f_branch2b', 'res4f_branch2b', 'bn4f_branch2b', 'res4f_branch2b_relu', 'res4f_branch2c', 'bn4f_branch2c', 'res4f', 'res4f_relu', 'res5a_branch2a', 'bn5a_branch2a', 'res5a_branch2a_relu', 'padding5a_branch2b', 'res5a_branch2b', 'bn5a_branch2b', 'res5a_branch2b_relu', 'res5a_branch2c', 'res5a_branch1', 'bn5a_branch2c', 'bn5a_branch1', 'res5a', 'res5a_relu', 'res5b_branch2a', 'bn5b_branch2a', 'res5b_branch2a_relu', 'padding5b_branch2b', 'res5b_branch2b', 'bn5b_branch2b', 'res5b_branch2b_relu', 'res5b_branch2c', 'bn5b_branch2c', 'res5b', 'res5b_relu', 'res5c_branch2a', 'bn5c_branch2a', 'res5c_branch2a_relu', 'padding5c_branch2b', 'res5c_branch2b', 'bn5c_branch2b', 'res5c_branch2b_relu', 'res5c_branch2c', 'bn5c_branch2c', 'res5c', 'res5c_relu', 'P5', 'P5_upsampled', 'C4_reduced', 'P4_merged', 'P4', 'P4_upsampled', 'C3_reduced', 'P6', 'P3_merged', 'C6_relu', 'P3', 'P7', 'regression_submodel', 'anchors_0', 'anchors_1', 'anchors_2', 'anchors_3', 'anchors_4', 'regression', 'classification_submodel', 'concatenate_1', 'classification', 'boxes', 'concatenate_2', 'nms', 'nms'])
is it fine if I rename the detection layer with nms for prediction? or will that be a problem on a later stage?
btw, I noticed that we are not compiling the model any more before inference, but python is giving a warning. Might this be the problem?
That's not possible, it's adding that `nms` layer later so you cannot add it twice. Do you use the change of https://github.com/fizyr/keras-retinanet/commit/a0c531d524a80483139b1226fb1515a0c56561fd#diff-f6accbfa89ee01e62ec03e00ddbd7d61 ? If not, please use that (and revert what you modified in `train.py`).
What I had meant was, create a new model with `nms=True` and load your weights into that model. Normally, this shouldn't be necessary, but there seemed to have been an issue with the `prediction_model` in `train.py` yesterday (resolved by https://github.com/fizyr/keras-retinanet/commit/a0c531d524a80483139b1226fb1515a0c56561fd#diff-f6accbfa89ee01e62ec03e00ddbd7d61), so any model trained before that has to be loaded this way.
Regarding compiling before inference, it might give a warning but it's not an issue. Compiling is only required for training.
no, I did not use that first change. will do now!
I tried
model = keras.models.load_model(nms=True,weights="...",custom_objects=custom_objects)
but it says unexpected keyword argument 'nms'
I meant something like this:
image = keras.layers.Input((None, None, 3))
model = keras_retinanet.models.resnet.ResNet50RetinaNet(image, num_classes=num_classes, weights='/path/to/your/weights/file.h5', nms=True)
thanks again! and what will be different for every model I train from now on?
Nice, looks better :)
and what will be different for every model I train from now on?
As in after https://github.com/fizyr/keras-retinanet/commit/a0c531d524a80483139b1226fb1515a0c56561fd#diff-f6accbfa89ee01e62ec03e00ddbd7d61 ? Hopefully you can just run `load_model` on those models created after that fix.
thanks a lot! I will close this issue for now. I will probably continue to test the csv-approach soon, but I will just open a new issue when Im experiencing new issues ;-)
you guys helped me a lot!
Good to hear :)
Also, for reference, PR https://github.com/fizyr/keras-retinanet/pull/170 should automate this process a bit ;)
Also getting "AttributeError: 'Namespace' object has no attribute 'val_path'"
I just cloned the repo less than an hour ago
@taewookim: If this is a recurring problem please open a new issue with a full stack trace so that we can fix it :)
hi again,
I installed the repo as stated in README. in my ~/.bashrc I added "export PYTHONPATH=/home/ivision/keras-retinanet:$PYTHONPATH"
I am getting the following error message. Is this an installation issue or does it have something to do with my Python path? Cause I changed the class names in ~/keras-retinanet/keras_retinanet/preprocessing ...
Thanks!
error message: