mittagessen / kraken

OCR engine for all the languages
http://kraken.re
Apache License 2.0

"kraken get 10.5281/zenodo.2577813" and "kraken list" commands failing after successful kraken==4.3.13 install in WSL2 #561

Closed Blu5Morpheus closed 6 months ago

Blu5Morpheus commented 6 months ago

As the title says, I'm trying to get kraken running on WSL2 with Ubuntu 22.04. The pip install completes without error, but when I run the kraken get or kraken list commands I get the following error:

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/cyberwitch/ocrenv/bin/kraken:8 in <module>                                                 │
│                                                                                                  │
│   5 from kraken.kraken import cli                                                                │
│   6 if __name__ == '__main__':                                                                   │
│   7 │   sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])                         │
│ ❱ 8 │   sys.exit(cli())                                                                          │
│   9                                                                                              │
│                                                                                                  │
│ /home/cyberwitch/ocrenv/lib/python3.10/site-packages/click/core.py:1157 in __call__              │
│                                                                                                  │
│ /home/cyberwitch/ocrenv/lib/python3.10/site-packages/click/core.py:1078 in main                  │
│                                                                                                  │
│ /home/cyberwitch/ocrenv/lib/python3.10/site-packages/click/core.py:1719 in invoke                │
│                                                                                                  │
│ /home/cyberwitch/ocrenv/lib/python3.10/site-packages/click/core.py:1434 in invoke                │
│                                                                                                  │
│ /home/cyberwitch/ocrenv/lib/python3.10/site-packages/click/core.py:783 in invoke                 │
│                                                                                                  │
│ /home/cyberwitch/ocrenv/lib/python3.10/site-packages/click/decorators.py:33 in new_func          │
│                                                                                                  │
│ /home/cyberwitch/ocrenv/lib/python3.10/site-packages/kraken/kraken.py:682 in list_models         │
│                                                                                                  │
│   679 │                                                                                          │
│   680 │   with KrakenProgressBar() as progress:                                                  │
│   681 │   │   download_task = progress.add_task('Retrieving model list', total=0, visible=True   │
│ ❱ 682 │   │   model_list = repo.get_listing(lambda total, advance: progress.update(download_ta   │
│   683 │   for id, metadata in model_list.items():                                                │
│   684 │   │   message('{} ({}) - {}'.format(id, ', '.join(metadata['type']), metadata['summary   │
│   685 │   ctx.exit(0)                                                                            │
│                                                                                                  │
│ /home/cyberwitch/ocrenv/lib/python3.10/site-packages/kraken/repo.py:228 in get_listing           │
│                                                                                                  │
│   225 │   total = resp['hits']['total']                                                          │
│   226 │   callback(total, 0)                                                                     │
│   227 │   records.extend(resp['hits']['hits'])                                                   │
│ ❱ 228 │   while 'next' in resp['links']:                                                         │
│   229 │   │   logger.debug('Fetching next page')                                                 │
│   230 │   │   r = requests.get(resp['links']['next'])                                            │
│   231 │   │   r.raise_for_status()                                                               │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
KeyError: 'links'

I've also tried installing various versions of kraken on multiple versions of Python in Python venvs, and every one gives the same error on the commands I listed. I also tried installing kraken with Conda, but Conda can't install it (it can't solve the environment).

mittagessen commented 6 months ago

Yes, Zenodo changed their API without doing any of the things you're supposed to do when making API changes (announcements, versioning, ...). It is fixed in main, but none of the existing releases work with the Zenodo repository anymore.

Sorry for the inconvenience.
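
For readers hitting the same KeyError: the traceback shows the pagination loop indexing resp['links'] directly, which fails as soon as a response lacks that key. The following is a minimal sketch of a more tolerant loop, not the actual fix committed to kraken. The function name iter_records and the fetch callable are hypothetical, and the response layout is assumed from the traceback above.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def iter_records(first_page: Dict[str, Any],
                 fetch: Callable[[str], Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Yield every record from a paginated JSON listing.

    `fetch` is a hypothetical callable that takes a URL and returns the
    decoded JSON of the next page (e.g. a thin wrapper around an HTTP GET).
    """
    page: Optional[Dict[str, Any]] = first_page
    while page is not None:
        yield from page.get('hits', {}).get('hits', [])
        # .get() with a default avoids a KeyError when 'links' is missing.
        next_url = page.get('links', {}).get('next')
        page = fetch(next_url) if next_url else None


# Fake two-page listing; the second page has no 'links' key at all.
pages = {'page-2': {'hits': {'hits': [{'id': 3}]}}}
first = {'hits': {'hits': [{'id': 1}, {'id': 2}]}, 'links': {'next': 'page-2'}}
records = list(iter_records(first, pages.__getitem__))
print([r['id'] for r in records])
```

Passing the fetch function in as a parameter keeps the sketch runnable without network access; in real use it would wrap whatever HTTP client the caller prefers.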

HelgeS commented 6 months ago

I can confirm kraken get 10.5281/zenodo.2577813 works when installing from HEAD, but kraken list unfortunately does not.

$ kraken list
Retrieving model list ━╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━   4% 1/26 -:--:-- 0:00:02
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /Users/helge/.pyenv/versions/3.10.9/envs/dlrgocr/bin/kraken:8 in <module>                        │
│                                                                                                  │
│   5 from kraken.kraken import cli                                                                │
│   6 if __name__ == '__main__':                                                                   │
│   7 │   sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])                         │
│ ❱ 8 │   sys.exit(cli())                                                                          │
│   9                                                                                              │
│                                                                                                  │
│ /Users/helge/.pyenv/versions/dlrgocr/lib/python3.10/site-packages/click/core.py:1157 in __call__ │
│                                                                                                  │
│ /Users/helge/.pyenv/versions/dlrgocr/lib/python3.10/site-packages/click/core.py:1078 in main     │
│                                                                                                  │
│ /Users/helge/.pyenv/versions/dlrgocr/lib/python3.10/site-packages/click/core.py:1719 in invoke   │
│                                                                                                  │
│ /Users/helge/.pyenv/versions/dlrgocr/lib/python3.10/site-packages/click/core.py:1434 in invoke   │
│                                                                                                  │
│ /Users/helge/.pyenv/versions/dlrgocr/lib/python3.10/site-packages/click/core.py:783 in invoke    │
│                                                                                                  │
│ /Users/helge/.pyenv/versions/dlrgocr/lib/python3.10/site-packages/click/decorators.py:33 in      │
│ new_func                                                                                         │
│                                                                                                  │
│ /Users/helge/src/kraken/kraken/kraken.py:680 in list_models                                      │
│                                                                                                  │
│   677 │                                                                                          │
│   678 │   with KrakenProgressBar() as progress:                                                  │
│   679 │   │   download_task = progress.add_task('Retrieving model list', total=0, visible=True   │
│ ❱ 680 │   │   model_list = repo.get_listing(lambda total, advance: progress.update(download_ta   │
│   681 │   for id, metadata in model_list.items():                                                │
│   682 │   │   message('{} ({}) - {}'.format(id, ', '.join(metadata['type']), metadata['summary   │
│   683 │   ctx.exit(0)                                                                            │
│                                                                                                  │
│ /Users/helge/src/kraken/kraken/repo.py:265 in get_listing                                        │
│                                                                                                  │
│   262 │   │   # merge metadata.jsn into DataCite                                                 │
│   263 │   │   key = record['metadata']['doi']                                                    │
│   264 │   │   models[key] = record['metadata']                                                   │
│ ❱ 265 │   │   models[key].update({'graphemes': metadata['graphemes'],                            │
│   266 │   │   │   │   │   │   │   'summary': metadata['summary'],                                │
│   267 │   │   │   │   │   │   │   'script': metadata['script'],                                  │
│   268 │   │   │   │   │   │   │   'link': record['links']['latest'],                             │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
UnboundLocalError: local variable 'metadata' referenced before assignment

Blu5Morpheus commented 6 months ago

That's a shame. Thanks for letting me know @mittagessen and @HelgeS.

mittagessen commented 6 months ago

kraken list definitely worked the last time I checked (December), but there is now an invalid record in the repository that caused this regression. I've committed a fix that adds a bit more error checking, so everything runs as expected again.
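
For illustration, the UnboundLocalError in the earlier traceback is the typical result of a variable that is only assigned when a matching file is found. The sketch below shows one way such error checking could look; it is an assumption-laden example, not the committed fix. The function name build_listing and the 'files', 'key' and 'content' fields are hypothetical stand-ins for however the record actually carries its metadata.

```python
import logging
from typing import Any, Dict, Iterable

logger = logging.getLogger(__name__)


def build_listing(records: Iterable[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Merge per-record model metadata into a listing, skipping bad records."""
    models: Dict[str, Dict[str, Any]] = {}
    for record in records:
        key = record['metadata']['doi']
        # Reset on every iteration so a record without metadata can never
        # hit an unbound name or reuse the previous record's value.
        metadata = None
        for entry in record.get('files', []):  # hypothetical layout
            if entry.get('key') == 'metadata.json':
                metadata = entry.get('content')  # assumed to be parsed JSON
        if metadata is None:
            logger.warning("No metadata found for record '%s'.", key)
            continue
        models[key] = dict(record['metadata'])
        models[key].update({'graphemes': metadata['graphemes'],
                            'summary': metadata['summary'],
                            'script': metadata['script']})
    return models


good = {'metadata': {'doi': '10.0000/example.1'},
        'files': [{'key': 'metadata.json',
                   'content': {'graphemes': ['a'], 'summary': 'demo', 'script': ['Latn']}}]}
broken = {'metadata': {'doi': '10.0000/example.2'}, 'files': []}
listing = build_listing([good, broken])
print(sorted(listing))
```

The invalid record is logged and skipped rather than aborting the whole listing, which matches the warning-then-continue behaviour reported later in this thread.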

HelgeS commented 6 months ago

Can confirm kraken list works now with the latest HEAD. Thanks for the quick fix!

$ kraken list
[01/04/24 12:46:04] WARNING  No metadata found for record '10.5281/zenodo.10259287'.  repo.py:264
Retrieving model list ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━  77% 20/26 0:00:06 0:00:18
10.5281/zenodo.10066219 (pytorch) - CATMuS Medieval
10.5281/zenodo.8425684 (pytorch) - Experimental printed syriac model
10.5281/zenodo.8193498 (pytorch) - Transcription model for Lucien Peraire's handwriting (French, 20th century)
10.5281/zenodo.7933463 (pytorch) - HTR model for German manuscripts trained from several datasets
10.5281/zenodo.7933402 (pytorch) - Fraktur model trained from enhanced Austrian Newspapers dataset
10.5281/zenodo.7631619 (pytorch) - Model trained on all available data, from 8th to 16th century, from GalliCorpora, CREMMA Medieval and CREMMA Medieval Lat, as well as data from Eutyches, Caroline Minuscule, DecameronFR. Transcription guidelines: https://hal.archives-ouvertes.fr/hal-03697382
10.5281/zenodo.7410529 (pytorch) - Gallicorpora+ ancient prints (Litterature)
10.5281/zenodo.7051646 (pytorch) - Printed Urdu Base Model Trained on the OpenITI Corpus
10.5281/zenodo.7051644 (pytorch) - Printed Persian Base Model Trained on the OpenITI Corpus
10.5281/zenodo.7050342 (pytorch) - Printed Ottoman Base Model Trained on the OpenITI Corpus
10.5281/zenodo.7050296 (pytorch) - Printed Arabic Base Model Trained on the OpenITI Corpus
10.5281/zenodo.7050270 (pytorch) - Printed Arabic-Script Base Model Trained on the OpenITI Corpus
10.5281/zenodo.6669508 (pytorch) - Cremma-Medieval Old French Model (Litterature)
10.5281/zenodo.6657809 (pytorch) - Model train on openly licensed data from HTR-United. All French manuscript data from the 17th century to the 21st were used (72k lines).
10.5281/zenodo.6542744 (pytorch) - LECTAUREP Contemporary French Model (Administration)
10.5281/zenodo.5468665 (pytorch) - Medieval Hebrew manuscripts in Sephardi bookhand version 1.0
10.5281/zenodo.5468573 (pytorch) - Medieval Hebrew manuscripts in Italian bookhand version 1.0
10.5281/zenodo.5468478 (pytorch) - Medieval Hebrew manuscripts in Ashkenazi bookhand
10.5281/zenodo.5468286 (pytorch) - Medieval Hebrew manuscripts version 1.0