vvolhejn / acres

CNN-based barcode sharpening
MIT License

Can not get the code to run properly #5

Open Tetsujinfr opened 3 years ago

Tetsujinfr commented 3 years ago

hi

First, thanks for sharing this great project! It looks like a really cool application of neural nets.

I am trying to run the training on a Colab instance but am getting some Python errors. I suspect a version mismatch among the Python dependencies, but I cannot figure out where the problem is.

Here are my commands:

!git clone https://github.com/vvolhejn/acres.git
!pip install tensorflow==1.14
!pip install opencv-python
%cd acres
!python -m acres.binarization.task --dataset-dir data/muenster_blur/ --train-steps 500 --job-dir logs/example/

Here is the error message I get:

Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/content/acres/acres/binarization/task.py", line 178, in <module>
    tf.logging.__dict__[args.verbosity] / 10)
KeyError: 'INFO'

I am not sure where the root cause is. I have tried to run with python2 and with python3 just in case but I got the same error.

Can you help, please? Do I need to install specific versions of dependencies such as TF or NumPy? Thanks a lot.
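For reference, the `KeyError: 'INFO'` in the traceback is raised while the verbosity name is looked up in the `tf.logging` module's dictionary, which evidently has no such key under this TensorFlow version. Below is a minimal sketch of a lookup that does not depend on TensorFlow internals. It assumes the verbosity names are the same as Python's standard `logging` level names; `verbosity_to_level` is a hypothetical helper and is not part of acres.

```python
import logging

# Level names from Python's standard logging module.
# Assumption: the --verbosity flag uses these same names.
_LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARN": logging.WARNING,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "FATAL": logging.CRITICAL,
}


def verbosity_to_level(name):
    """Map a verbosity name such as 'INFO' to its numeric logging level."""
    try:
        return _LEVELS[name.upper()]
    except KeyError:
        raise ValueError("Unknown verbosity: %r" % name)


if __name__ == "__main__":
    print(verbosity_to_level("INFO"))
```

The sketch only covers the name lookup; what the failing line in task.py does with the result afterwards is left as it is.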

vvolhejn commented 3 years ago

Hi, glad you're interested in the project. Can you try rerunning with TensorFlow version 1.5? It's been a while since I've looked at the project, but from looking at a commented-out line in setup.py, this seems to be the version I was using. Let me know if that helps.

Tetsujinfr commented 3 years ago

OK, thanks, I'll try that version. Actually, I already solved the previous issue by installing TF 1.12 (I tried 1.14 but hit another issue because of it). I then ran into a new issue, but I will try TF 1.5, which I have not tried yet.

One question: can I train the model on CPU only, or must I use GPU (CUDA) acceleration? I am asking because getting Colab to run on GPU with TF < 2.0 is very difficult, not to say impossible, nowadays. Thanks

Tetsujinfr commented 3 years ago

So after switching to TF 1.5 the first error is gone, but I have a new error, the same one I had with TF 1.12:

!python -m acres.binarization.task --dataset-dir data/muenster_blur/ --train-steps 500 --job-dir logs/example/

GPU:
INFO:tensorflow:Using config: {'_model_dir': 'logs/example/task.py-2021-01-03_010546-bs=40,cp=0.0,es=10,ih=600,iw=800,nn=strided32,pc=2,ps=1,ts=500', '_tf_random_seed': 42, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 120, '_session_config': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f57964c1080>, '_task_type': 'worker', '_task_id': 0, '_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
INFO:tensorflow:Running training and evaluation locally (non-distributed).
INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after 600 secs (eval_spec.throttle_secs) or training is finished.
Number of parameters: 1.0
INFO:tensorflow:Create CheckpointSaverHook.
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1350, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1329, in _run_fn
    status, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/errors_impl.py", line 473, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Expected image (JPEG, PNG, or GIF), got unknown format starting with 'version https://'
  [[Node: DecodeJpeg_1 = DecodeJpegacceptable_fraction=1, channels=3, dct_method="", fancy_upscaling=true, ratio=1, try_recover_truncated=false]]
  [[Node: IteratorGetNext = IteratorGetNextoutput_shapes=[[?,5,800,1], [?,796]], output_types=[DT_FLOAT, DT_INT32], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/content/acres/acres/binarization/task.py", line 185, in <module>
    run_experiment(hparams)
  File "/content/acres/acres/binarization/task.py", line 64, in run_experiment
    eval_spec)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/estimator/training.py", line 432, in train_and_evaluate
    executor.run_local()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/estimator/training.py", line 611, in run_local
    hooks=train_hooks)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/estimator/estimator.py", line 314, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/estimator/estimator.py", line 815, in _train_model
    _, loss = mon_sess.run([estimator_spec.train_op, estimator_spec.loss])
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/monitored_session.py", line 539, in run
    run_metadata=run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/monitored_session.py", line 1013, in run
    run_metadata=run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/monitored_session.py", line 1104, in run
    raise six.reraise(*original_exc_info)
  File "/usr/local/lib/python3.6/dist-packages/six.py", line 703, in reraise
    raise value
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/monitored_session.py", line 1089, in run
    return self._sess.run(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/monitored_session.py", line 1161, in run
    run_metadata=run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/monitored_session.py", line 941, in run
    return self._sess.run(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 895, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1128, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1344, in _do_run
    options, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1363, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Expected image (JPEG, PNG, or GIF), got unknown format starting with 'version https://'
  [[Node: DecodeJpeg_1 = DecodeJpegacceptable_fraction=1, channels=3, dct_method="", fancy_upscaling=true, ratio=1, try_recover_truncated=false]]
  [[Node: IteratorGetNext = IteratorGetNextoutput_shapes=[[?,5,800,1], [?,796]], output_types=[DT_FLOAT, DT_INT32], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

Something silly seems to be happening: instead of getting the training images as input, the code receives a string starting with "version https://".

As you can see, I do not have any GPU acceleration on Colab with TF 1.5. FYI, I installed OpenCV version 3.1.0.4 on the Colab instance.

Would you have an idea on how to fix this?

thanks a lot

Tetsujinfr commented 3 years ago

OK, so I googled the error and found this thread. I renamed all the blur training images under acres/data/muenster_blur/images from ".jpg" to ".jpeg" and the error went away! I definitely love TF.

But now some piece of code still expects the old names with the ".jpg" extension, and I am not sure where it is. It is probably easy to fix, but I cannot see where to change it in your code base.

Below is the new error message. Can you point me in the right direction?

GPU:
INFO:tensorflow:Using config: {'_model_dir': '/logs/example/task.py-2021-01-03_014850-bs=40,cp=0.0,es=10,ih=600,iw=800,nn=strided32,pc=2,ps=1,ts=500', '_tf_random_seed': 42, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 120, '_session_config': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f4c491f00b8>, '_task_type': 'worker', '_task_id': 0, '_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
INFO:tensorflow:Running training and evaluation locally (non-distributed).
INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after 600 secs (eval_spec.throttle_secs) or training is finished.
Number of parameters: 1.0
INFO:tensorflow:Create CheckpointSaverHook.
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1350, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1329, in _run_fn
    status, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/errors_impl.py", line 473, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: data/muenster_blur/images/07_4001608023510-01_N95-2592x1944_scaledTo800x600bilinear.jpg; No such file or directory
  [[Node: ReadFile = ReadFile]]
  [[Node: IteratorGetNext = IteratorGetNextoutput_shapes=[[?,5,800,1], [?,796]], output_types=[DT_FLOAT, DT_INT32], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/content/acres/acres/binarization/task.py", line 185, in <module>
    run_experiment(hparams)
  File "/content/acres/acres/binarization/task.py", line 64, in run_experiment
    eval_spec)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/estimator/training.py", line 432, in train_and_evaluate
    executor.run_local()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/estimator/training.py", line 611, in run_local
    hooks=train_hooks)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/estimator/estimator.py", line 314, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/estimator/estimator.py", line 815, in _train_model
    _, loss = mon_sess.run([estimator_spec.train_op, estimator_spec.loss])
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/monitored_session.py", line 539, in run
    run_metadata=run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/monitored_session.py", line 1013, in run
    run_metadata=run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/monitored_session.py", line 1104, in run
    raise six.reraise(*original_exc_info)
  File "/usr/local/lib/python3.6/dist-packages/six.py", line 703, in reraise
    raise value
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/monitored_session.py", line 1089, in run
    return self._sess.run(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/monitored_session.py", line 1161, in run
    run_metadata=run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/monitored_session.py", line 941, in run
    return self._sess.run(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 895, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1128, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1344, in _do_run
    options, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1363, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: data/muenster_blur/images/07_4001608023510-01_N95-2592x1944_scaledTo800x600bilinear.jpg; No such file or directory
  [[Node: ReadFile = ReadFile]]
  [[Node: IteratorGetNext = IteratorGetNextoutput_shapes=[[?,5,800,1], [?,796]], output_types=[DT_FLOAT, DT_INT32], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

thanks

Tetsujinfr commented 3 years ago

OK, so I found the line in dataset.py where the image filename extension is expected to be .jpg and replaced it with .jpeg, but that brought me back to the previous error. In other words, renaming the images from .jpg to .jpeg did not fix the previous error; it only introduced a new one that fires earlier in the code.

So I am stuck again with this error (full error log already provided above):

tensorflow.python.framework.errors_impl.InvalidArgumentError: Expected image (JPEG, PNG, or GIF), got unknown format starting with 'version https://'

do you have any idea?

mainulquraishi commented 2 years ago

I am facing the same problem. Did you find a solution to this? @Tetsujinfr @vvolhejn

Tetsujinfr commented 2 years ago

@mainulquraishi I gave up on this. If you can afford it, I recommend this commercial solution. It works amazingly well for barcode detection, even with blurry and noisy images, and it is fast as well. https://www.scandit.com/ Good luck.

vvolhejn commented 2 years ago

@mainulquraishi @Tetsujinfr

Hi, sorry for not replying. I'm afraid I don't have the time to maintain this project anymore. I found this StackOverflow thread and the answer seems pretty plausible - the type of the image is determined by "magic bytes" at the beginning of the file, so it seems like it's reading some kind of text file that starts with "version https://" rather than the expected magic bytes of the JPG format.
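The magic-byte idea can be sketched in a few lines of Python. This is only an illustration of the general technique, not how TensorFlow implements it; `sniff_image_type` is a hypothetical helper name. The signatures used are the well-known starting bytes of JPEG, PNG, and GIF files.

```python
def sniff_image_type(path):
    """Guess the image type from the first bytes of a file, ignoring its extension."""
    with open(path, "rb") as f:
        header = f.read(8)
    if header.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if header.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if header[:6] in (b"GIF87a", b"GIF89a"):
        return "gif"
    # Anything else, e.g. a text file starting with "version https://".
    return None
```

Because only the file contents are inspected, renaming a file from .jpg to .jpeg does not change the result, which matches what was reported earlier in this thread.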

It's possible that the issue is caused by the fact that the data files are stored using Git Large File Storage and need to be loaded separately. See the Installation section in the README.
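One way to check this, as a rough sketch: Git LFS pointer files are small text files whose first line is `version https://git-lfs.github.com/spec/v1`, so scanning the dataset directory for files with that prefix shows whether the real images were downloaded. The function names below are hypothetical and not part of acres.

```python
import os

# First line of every Git LFS pointer file.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file starts like a Git LFS pointer."""
    with open(path, "rb") as f:
        return f.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def find_lfs_pointers(root):
    """Return a sorted list of files under root that are still LFS pointers."""
    pointers = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if is_lfs_pointer(full):
                pointers.append(full)
    return sorted(pointers)
```

For example, `find_lfs_pointers("data/muenster_blur/images")` run from the repository root should return an empty list once the data has been fetched. If it does not, installing Git LFS and running `git lfs pull` inside the clone is the usual way to replace the pointers with the actual files.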

In either case, this was an experimental project and I don't think it's suitable for applications - the barcodes are unblurred, but I haven't had much luck trying to scan the unblurred barcodes using software. Good luck!

mainulquraishi commented 2 years ago

Thanks, @vvolhejn and @Tetsujinfr. Yeah, the LFS download was the problem. I was doing a normal git clone, so the images were not properly downloaded.