Open b4nn3d opened 4 years ago
I have the same problem on Paperspace. If I create any labelled datasets, they won't train past "Exporting sample images". Training without a pkl file also raises an error if resume=None is used; the help describes resume=noresume, which works up to the problem above.
For now I recommend using the non-raw formatting until I can figure out what's going on there. You generally cannot use transfer learning with labels unless your label count matches exactly. You can just remove the --resume argument from the training command.
@dvschultz Hi, any solution to this problem?
I'm experiencing the same problem. I didn't create my TF records from the raws, either - they're at 1024x1024 fyi. Any advice for another workaround? Does anyone have an existing pickle with a decent number of labels I can resume from?
I'm not sure how to correctly solve the problem, but the cause is in the setup_snapshot_image_grid() function in training_loop.py. More precisely, the condition all(len(block) >= cw * ch for block in blocks) never evaluates to True, so the code keeps looping for one million iterations. In my case len(blocks) is 8 and cw * ch is 15, and it stays that way.
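To illustrate why that exit condition can never be met, here is a minimal sketch of a grid-filling loop of this shape. It is a simplified stand-in, not the actual code from training_loop.py: fill_blocks, next_label, and the cap of 1000 iterations are hypothetical, and the values 8 and 15 are only borrowed from the numbers quoted above.

```python
import itertools

def fill_blocks(next_label, num_blocks, per_block, max_iters=1000):
    """Simplified stand-in for the grid-filling loop (hypothetical helper).

    One block per class; the loop may only exit early once every block
    holds per_block samples.
    """
    blocks = [[] for _ in range(num_blocks)]
    for iteration in range(1, max_iters + 1):
        idx = next_label()  # class index of the next sample drawn
        if idx < num_blocks and len(blocks[idx]) < per_block:
            blocks[idx].append(idx)
        # Same shape as the condition discussed above.
        if all(len(block) >= per_block for block in blocks):
            return True, iteration, [len(b) for b in blocks]
    # Iteration budget exhausted without filling every block.
    return False, max_iters, [len(b) for b in blocks]

# A sampler that only ever yields the first class: the loop cannot finish.
stuck = fill_blocks(lambda: 0, num_blocks=8, per_block=15)

# A sampler that cycles through all classes: the loop finishes quickly.
class_cycle = itertools.cycle(range(8))
mixed = fill_blocks(lambda: next(class_cycle), num_blocks=8, per_block=15)

print(stuck)
print(mixed)
```

With a sampler stuck on one class, only that block fills and the loop runs out its whole budget, which would match the hang at "Exporting sample images" if the real loop behaves the same way.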
I think what the code is doing is testing whether a full row of images is generated for each class? And I'm not sure if this has anything to do with it, but when getting minibatches, the examples with label == 1 are returned until there are no more examples from that class. So I don't think more than one row is ever filled. Maybe there is something wrong with the shuffle implementation in create_from_image_folders() and create_from_image_folders_raw() which causes the minibatch to always return examples from the first class?
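One way to test the shuffle theory is to look at how long the runs of identical labels are in dataset order. The sketch below is an assumption-laden illustration, not code from create_from_image_folders(): longest_run and shuffle_pairs are hypothetical helpers, and the file names are made up.

```python
import random

def longest_run(labels):
    """Length of the longest stretch of consecutive identical labels."""
    longest = current = 0
    previous = object()  # sentinel that equals no real label
    for label in labels:
        current = current + 1 if label == previous else 1
        previous = label
        longest = max(longest, current)
    return longest

def shuffle_pairs(images, labels, seed=0):
    """Shuffle images and labels with one shared permutation so pairs stay aligned."""
    order = list(range(len(images)))
    random.Random(seed).shuffle(order)
    return [images[i] for i in order], [labels[i] for i in order]

# Hypothetical dataset written class by class, as a folder walk would produce.
images = [f"class{c}_{i}.png" for c in range(3) for i in range(4)]
labels = [c for c in range(3) for _ in range(4)]

shuffled_images, shuffled_labels = shuffle_pairs(images, labels)

print(longest_run(labels))           # run length before shuffling
print(longest_run(shuffled_labels))  # run length after shuffling
```

If the labels stored in a dataset show one run per class, the examples were written in class order and any shuffling would have to happen at read time.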
On Colab, it stays forever on this message:
Exporting sample images...
Raw images without labels train fine, and non-raw images with labels train fine too. I've only tried a 512x512 resolution.