OCR-D / core

Collection of OCR-related python tools and wrappers from @OCR-D
https://ocr-d.de/core/
Apache License 2.0

ocrd resmgr download ocrd-calamari-recognize qurator-gt4histocr-1.0 gives huge logfile #807

Open mikegerber opened 2 years ago

mikegerber commented 2 years ago

Example: https://circleci.com/api/v1.1/project/github/OCR-D/ocrd_calamari/177/output/106/0?file=true&allocation-id=62165a9241d4334ebb050ee2-0-build%2F1CB8E496

Excerpt:

ocrd resmgr download ocrd-calamari-recognize qurator-gt4histocr-1.0
16:06:28.067 INFO ocrd.cli.resmgr - Downloading resource {'url': 'https://qurator-data.de/calamari-models/GT4HistOCR/2019-12-11T11_10+0100/model.tar.xz', 'type': 'tarball', 'name': 'qurator-gt4histocr-1.0', 'description': 'Calamari model trained with GT4HistOCR', 'size': 90275264, 'path_in_archive': '.', 'version_range': '>= 1.0.0', 'parameter_usage': 'as-is'}

  [------------------------------------]    0%16:06:28.070 INFO ocrd.resource_manager._download_impl - Downloading https://qurator-data.de/calamari-models/GT4HistOCR/2019-12-11T11_10+0100/model.tar.xz to download.tar.xx

  [------------------------------------]    0%  07:42:16
  [------------------------------------]    0%  07:42:15
  [------------------------------------]    0%  07:42:14
  [------------------------------------]    0%  07:42:12
  [------------------------------------]    0%  07:42:11
  [------------------------------------]    0%  07:42:10
  [------------------------------------]    0%  07:42:08
  [------------------------------------]    0%  07:42:07
  [------------------------------------]    0%  07:42:06
  [------------------------------------]    0%  07:42:05
  [------------------------------------]    0%  07:42:03
  [------------------------------------]    0%  07:42:02
  [------------------------------------]    0%  07:42:01
  [------------------------------------]    0%  07:42:00
  [------------------------------------]    0%  07:41:58
  [------------------------------------]    0%  07:41:57
  [------------------------------------]    0%  07:41:56
  [------------------------------------]    0%  07:41:55
  [------------------------------------]    0%  07:41:53
  [------------------------------------]    0%  07:41:52
  [------------------------------------]    0%  07:41:51
  [------------------------------------]    0%  07:41:50
  [------------------------------------]    0%  07:41:48

... and so on ...

bertsky commented 2 years ago

Indeed, very annoying. The repeated lines come from the progress bar, not from the OCR-D logging, but stdout still quickly fills with megabytes of output, even in scripts.

I think the best fix would be to follow Click's documentation and pass file=sys.stdout – it should then suppress this kind of output in scripts, but still show the bar on ttys.
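To illustrate the behaviour being asked for (bar on ttys, no bar spam when output goes to a file or pipe), here is a minimal standard-library sketch. It is not the OCR-D resource manager code and not Click's implementation. The function name `report_progress` and its parameters are hypothetical, chosen only for this example. With Click itself, the corresponding knob is the `file` parameter of `click.progressbar`.

```python
import io
import sys


def report_progress(chunks, total, stream=None, width=36):
    """Yield chunks unchanged; draw a progress bar only on a terminal.

    Hypothetical helper for illustration, not part of OCR-D or Click.
    """
    stream = sys.stdout if stream is None else stream
    # Redirected files, pipes and CI log collectors report isatty() == False.
    interactive = hasattr(stream, "isatty") and stream.isatty()
    total = max(total, 1)  # avoid division by zero for empty downloads
    done = 0
    for chunk in chunks:
        done += len(chunk)
        if interactive:
            filled = min(width, width * done // total)
            bar = "#" * filled + "-" * (width - filled)
            pct = min(100, 100 * done // total)
            # "\r" redraws the same line, which only works on a terminal.
            stream.write(f"\r  [{bar}]  {pct:3d}%")
            stream.flush()
        yield chunk
    if interactive:
        stream.write("\n")
    else:
        # Non-interactive: a single summary line instead of one per update.
        stream.write(f"downloaded {done} bytes\n")


if __name__ == "__main__":
    fake_download = [b"x" * 1024 for _ in range(64)]
    for _ in report_progress(fake_download, total=64 * 1024):
        pass
```

Run in a terminal, this redraws a single bar line. Run with stdout redirected (for example in a CI job), it writes one summary line, so the log stays small.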