BoboTiG / ebook-reader-dict

Finally decent dictionaries based on Wiktionary for your beloved eBook reader.
http://www.tiger-222.fr/?d=2020/04/17/22/14/21-un-dictionnaire-alternatif-et-complet-pour-votre-liseuse
MIT License

[EL] Add EL locale #977

Closed chopinesque closed 3 years ago

chopinesque commented 3 years ago

I am trying to add Greek. I wonder if you could give me some feedback on the regexes. Below you see some examples and what I have come up with so far (I tried editing the IT file). The pronunciation appears to have variant structures, not sure how to accommodate that.

# Regex to find the pronunciation
# {{ΔΦΑ|tɾeˈlos|γλ=el}}
# {{ΔΦΑ|γλ=el|ˈni.xta}}
pronunciation = r"{ΔΦΑ\|γλ=el\|/([^/]+)/"
# Regex to find the gender
# '''{{PAGENAME}}''' {{θ}}
# '''{{PAGENAME}}''' {{ο}}
# '''{{PAGENAME}}''' {{α}}
gender = r"'''{{PAGENAME}}''' ([θαο])"

I tried running it and I got

>> Processing data\el\pages-20210620.xml ...
Traceback (most recent call last):
  File "C:\Users\spiros\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\spiros\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\path1\wikidict\wikidict\__main__.py", line 118, in <module>
    sys.exit(main())
  File "C:\path1\wikidict\wikidict\__main__.py", line 110, in main
    parse.main(args["LOCALE"])
  File "C:\path1\wikidict\wikidict\parse.py", line 103, in main
    words = process(file, locale)
  File "C:\path1\wikidict\wikidict\parse.py", line 70, in process
    word, code = xml_parse_element(element, locale)
  File "C:\path1\wikidict\wikidict\parse.py", line 57, in xml_parse_element
    if all(section not in code for section in head_sections[locale]):
KeyError: 'el'

This is the whole file:

"""Greek language."""
from typing import Dict, Tuple

# Regex to find the pronunciation
# {{ΔΦΑ|tɾeˈlos|γλ=el}}
# {{ΔΦΑ|γλ=el|ˈni.xta}}
pronunciation = r"{ΔΦΑ\|γλ=el\|/([^/]+)/"
# Regex to find the gender
# '''{{PAGENAME}}''' {{θ}}
# '''{{PAGENAME}}''' {{ο}}
# '''{{PAGENAME}}''' {{α}}
gender = r"'''{{PAGENAME}}''' ([θαο])"

# Float number separator
float_separator = ","

# Thousands separator
thousands_separator = " "

# Markers for sections that contain interesting text to analyse.
head_sections = ("{{-el-}}",)
etyl_section = ["{{ετυμολογία}}"]
sections = (
    *head_sections,
    *etyl_section,
    "{{ουσιαστικό}}",
    "{{ρήμα}}",
    "{{επίθετο}}",
    "{{επίρρημα}}",
    "{{σύνδεσμος}}",
    "{{συντομομορφή}}",
    "{{κύριο όνομα}}",
    "{{αριθμητικό}}",
    "{{άρθρο}}",
    "{{μετοχή}}",
    "{{μόριο}}",
    "{{αντωνυμία}}",
    "{{επιφώνημα}}",
    "{{ρηματική έκφραση}}",
    "{{επιρρηματική έκφραση}}",
)

# Some definitions are not good to keep (plural, gender, ... )
definitions_to_ignore = (
    "{{μορφή ουσιαστικού",
    "{{μορφή ρήματος",
    "{{μορφή επιθέτου",
    "{{εκφράσεις",
)

# Templates to ignore: the text will be deleted.
templates_ignored: Tuple[str, ...] = tuple()

# Templates that will be completed/replaced using italic style.
templates_italic: Dict[str, str] = {}

# Templates more complex to manage.
templates_multi: Dict[str, str] = {
    # {{Term|statistica|it}}   
    # "term": "small(term(parts[1]))",
}

# Release content on GitHub
# https://github.com/BoboTiG/ebook-reader-dict/releases/tag/el
release_description = """\
Αριθμός λέξεων: {words_count}
Εξαγωγή Wiktionary: {dump_date}

Διαθέσιμα αρχεία:

- [Kobo]({url_kobo}) (dicthtml-{locale}.zip)
- [StarDict]({url_stardict}) (dict-{locale}.zip)
- [DictFile]({url_dictfile}) (dict-{locale}.df)

<sub>Aggiornato il {creation_date}</sub>
"""  # noqa

# Dictionary name that will be printed below each definition
wiktionary = "Βικιλεξικό (ɔ) {year}"

lasconic commented 3 years ago

You probably ran it with the it locale, so the el locale is not found. Let's add Greek as a new locale instead?

chopinesque commented 3 years ago

I ran it with python -m wikidict el

But I am not sure what to do with the tests folder. I duplicated test_it.py and renamed it to test_el.py, but what sort of content should I add?

I can also see that lang\el\__pycache__\__init__.cpython-39.pyc still has "it" content.

lasconic commented 3 years ago

If you copied the it directory from wikidict/lang/it to wikidict/lang/el, you need to modify https://github.com/BoboTiG/ebook-reader-dict/blob/master/wikidict/lang/__init__.py and add el a little bit everywhere.
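The `KeyError: 'el'` in the traceback above comes from a lookup of the form `head_sections[locale]`: each per-locale setting is collected into a dictionary keyed by locale code. Here is a minimal sketch of that pattern, assuming a simplified registry. The `SimpleNamespace` objects stand in for the real `wikidict.lang.<locale>` modules, and the Italian marker and the `has_head_section` helper are illustrative only, not the project's actual code.

```python
from types import SimpleNamespace

# Stand-ins for the per-locale modules (hypothetical, for illustration only).
it = SimpleNamespace(head_sections=("{{-it-}}",))
el = SimpleNamespace(head_sections=("{{-el-}}",))

# Each setting is exposed as a dict keyed by locale code.
head_sections = {
    "it": it.head_sections,
    # Without this entry, head_sections["el"] raises KeyError: 'el'.
    "el": el.head_sections,
}


def has_head_section(code: str, locale: str) -> bool:
    """Return True if the page source contains a head section for the locale."""
    return any(section in code for section in head_sections[locale])
```

Every such dictionary in `wikidict/lang/__init__.py` needs its own "el" entry, which is why a single missing one is enough to trigger the error.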

chopinesque commented 3 years ago

Right, I did that, now I get

 File "C:\Users\spiros\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\path1\wikidict\wikidict\__main__.py", line 118, in <module>
    sys.exit(main())
  File "C:\path1\wikidict\wikidict\__main__.py", line 71, in main
    from . import find_templates
  File "C:\path1\wikidict\wikidict\find_templates.py", line 8, in <module>
    from .lang import sections
  File "C:\path1\wikidict\wikidict\lang\__init__.py", line 10, in <module>
    from .el.langs import langs as EL
ModuleNotFoundError: No module named 'wikidict.lang.el.langs'

lasconic commented 3 years ago

run

./check.sh

before running anything. It's probably a syntax error and check.sh will give you more info.

lasconic commented 3 years ago

ModuleNotFoundError: no idea. Maybe push your code to GitHub?

chopinesque commented 3 years ago

I think I did.

https://github.com/chopinesque/ebook-reader-dict

lasconic commented 3 years ago

Ok, you still need to edit https://github.com/BoboTiG/ebook-reader-dict/blob/master/wikidict/lang/__init__.py to add el

chopinesque commented 3 years ago

I have done it https://github.com/chopinesque/ebook-reader-dict/blob/master/wikidict/lang/__init__.py

Maybe some el sections should not be added in wikidict/lang/__init__.py; not sure, really.

lasconic commented 3 years ago

You don't have a langs.py file in the el directory yet. You can delete this line: https://github.com/chopinesque/ebook-reader-dict/blob/497bae1ca97183bea87ceae85021ca88ce056ad5/wikidict/lang/__init__.py#L10 and use EN for a start on this line: https://github.com/chopinesque/ebook-reader-dict/blob/497bae1ca97183bea87ceae85021ca88ce056ad5/wikidict/lang/__init__.py#L19

If you know of a page listing all the languages in Greek with their ISO codes, you'll need to write a script similar to scripts/en-langs.py to download and store them.

lasconic commented 3 years ago

This page probably needs to be downloaded and parsed: https://el.wiktionary.org/wiki/Module:Languages

chopinesque commented 3 years ago

I edited __init__.py as directed. Not sure how to parse the Module:Languages file.

lasconic commented 3 years ago

If you did it as directed, you should be able to run something like the following without errors.

python -m wikidict el --get-word="λαμβάνω" --raw

You will see the etymology is in but not the definition. In el/__init__.py, in sections, you might want to add "{{ρήμα|el}".

chopinesque commented 3 years ago

Actually, it was my bad: I had overwritten lang\el\__init__.py with lang\__init__.py. Now it is fixed.

I got:

C:\path1\wikidict>python -m wikidict el --get-word="λαμβάνω" --raw
Traceback (most recent call last):
  File "C:\Users\spiros\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\spiros\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\path1\wikidict\wikidict\__main__.py", line 118, in <module>
    sys.exit(main())
  File "C:\path1\wikidict\wikidict\__main__.py", line 94, in main
    return get_word.main(args["LOCALE"], args["--get-word"], args["--raw"])
  File "C:\path1\wikidict\wikidict\get_word.py", line 80, in main
    get_and_parse_word(word, locale, raw)
  File "C:\path1\wikidict\wikidict\get_word.py", line 36, in get_and_parse_word
    details = get_word(word, locale)
  File "C:\path1\wikidict\wikidict\get_word.py", line 17, in get_word
    return parse_word(word, code, locale, force=True)
  File "C:\path1\wikidict\wikidict\render.py", line 299, in parse_word
    parsed_sections = find_sections(code, locale)
  File "C:\path1\wikidict\wikidict\render.py", line 275, in find_sections
    wanted = sections[locale]
KeyError: 'el'

lasconic commented 3 years ago

Also, you don't have a last_template_handler in el/__init__.py. You can change this line https://github.com/chopinesque/ebook-reader-dict/blob/497bae1ca97183bea87ceae85021ca88ce056ad5/wikidict/lang/__init__.py#L285 to use defaults.last_template_handler. And I guess you will have to add support for several templates.

lasconic commented 3 years ago

Here is a script to download and parse the language list: https://gist.github.com/lasconic/468875aaa77c6d8c8a107c0a9a4902c7, with the results a little bit lower. I don't speak Greek, so it could be completely wrong.

lasconic commented 3 years ago

https://github.com/chopinesque/ebook-reader-dict/commit/6999a30b0992e43c7f9db8df2e4935a19a95fe22#diff-7688315e6bbbfa4f949cce20d41afe97560944d8eea9b8d36113ac618e4f4f90R19 This should be "el":EN

chopinesque commented 3 years ago

OK, all was done apart from "add support for several templates", I guess you mean add a "template_handlers.py" file, not sure how that is done.

lasconic commented 3 years ago

If you can run the following and get etymology and 3 definitions.

python -m wikidict el --get-word="λαμβάνω" --raw

then you can run

python -m wikidict el --find-templates

and it will create sections.txt and templates.txt files in the root folder. You need to implement the templates listed in templates.txt. An example of an unimplemented template can be found in λαμβάνω: <i>(Μτφρ)</i> καταλαβαίνω should be <i>(μεταφορικά)</i> καταλαβαίνω, so the μτφρ template needs to be implemented. This can be done in several places depending on the complexity of the template. Check en/__init__.py or fr/__init__.py for examples in templates_italic, templates_other, templates_multi, and, as a last resort, last_template_handler.
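For the simplest case, an entry in templates_italic may be enough. The sketch below shows the idea under stated assumptions: the `render_italic` helper is hypothetical and only imitates the observed output (a capitalised template name in italics when no mapping exists); it is not the project's actual rendering code.

```python
from typing import Dict

# Entry that could go in templates_italic in the el locale file:
# template name -> text to display.
templates_italic: Dict[str, str] = {
    "μτφρ": "μεταφορικά",
}


def render_italic(template: str) -> str:
    """Hypothetical helper: render a simple template in italic style.

    Falls back to the capitalised template name when there is no mapping,
    which mirrors the <i>(Μτφρ)</i> output seen for λαμβάνω.
    """
    text = templates_italic.get(template, template.capitalize())
    return f"<i>({text})</i>"
```

With the mapping in place, `render_italic("μτφρ")` gives the expected label, while an unmapped name such as "ετυμ" still falls back to its capitalised form.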

lasconic commented 3 years ago

Before all that, you can create a pull request with your changes, so they can be added to the project. https://docs.github.com/en/github/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request

chopinesque commented 3 years ago

I still get

C:\path1\wikidict>python -m wikidict el --get-word="λαμβάνω" --raw
Traceback (most recent call last):
  File "C:\Users\spiros\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\spiros\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\path1\wikidict\wikidict\__main__.py", line 118, in <module>
    sys.exit(main())
  File "C:\path1\wikidict\wikidict\__main__.py", line 94, in main
    return get_word.main(args["LOCALE"], args["--get-word"], args["--raw"])
  File "C:\path1\wikidict\wikidict\get_word.py", line 80, in main
    get_and_parse_word(word, locale, raw)
  File "C:\path1\wikidict\wikidict\get_word.py", line 36, in get_and_parse_word
    details = get_word(word, locale)
  File "C:\path1\wikidict\wikidict\get_word.py", line 17, in get_word
    return parse_word(word, code, locale, force=True)
  File "C:\path1\wikidict\wikidict\render.py", line 299, in parse_word
    parsed_sections = find_sections(code, locale)
  File "C:\path1\wikidict\wikidict\render.py", line 275, in find_sections
    wanted = sections[locale]
KeyError: 'el'

lasconic commented 3 years ago

In wikidict/lang/__init__.py, the sections variable is not correct for el.

chopinesque commented 3 years ago

You mean this?

lasconic commented 3 years ago

Well, it looks good, and running your master branch I don't get this error...

chopinesque commented 3 years ago

I copied the online code from my master branch into a new dir and I get:

C:\path1\ebook-reader-dict-master>python -m wikidict el --get-word="λαμβάνω" --raw
Traceback (most recent call last):
  File "C:\Users\spiros\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\spiros\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\path1\ebook-reader-dict-master\wikidict\__main__.py", line 118, in <module>
    sys.exit(main())
  File "C:\path1\ebook-reader-dict-master\wikidict\__main__.py", line 92, in main
    from . import get_word
  File "C:\path1\ebook-reader-dict-master\wikidict\get_word.py", line 7, in <module>
    from .render import parse_word
  File "C:\path1\ebook-reader-dict-master\wikidict\render.py", line 12, in <module>
    from .lang import (
  File "C:\path1\ebook-reader-dict-master\wikidict\lang\__init__.py", line 31, in <module>
    "el": re.compile(el.pronunciation),
NameError: name 'el' is not defined

lasconic commented 3 years ago

It's because of the commented-out line: https://github.com/chopinesque/ebook-reader-dict/blob/master/wikidict/lang/__init__.py#L10

chopinesque commented 3 years ago

C:\path1\wikidict>python -m wikidict el --get-word="λαμβάνω" --raw
 !! Missing 'λ' template support for word 'λαμβάνω'
 !! Missing 'ετυμ' template support for word 'λαμβάνω'
λαμβάνω  ''

'<b>λαμβάνω</b> < < <i>(Ετυμ)</i> *<i>sleh₂gʷ</i>-'

lasconic commented 3 years ago

It's running :) As you can see, the regex for pronunciation doesn't work. Also, no definitions are found. Not sure if a gender makes sense for this word, but it's not found either. You can also see the two missing templates that need to be implemented: 'λ' and 'ετυμ'.

I have a working pronunciation regex and the definitions working. Also I put the lang script in and the tests. I will make a PR so you can continue and implement templates.
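For reference, here is one way to cover both parameter orders shown in the first post. It is only a sketch under stated assumptions, not necessarily the regex that ends up in the PR: it assumes the pronunciation is the first parameter that is not `γλ=el`, and that the gender letter sits inside its own `{{...}}` template.

```python
import re

# Assumed pattern: the optional group skips a leading "γλ=el|" parameter,
# then the capture stops at the next "|" or "}".
pronunciation = re.compile(r"\{\{ΔΦΑ\|(?:γλ=el\|)?([^|}]+)")

# Assumed pattern: the gender letter is wrapped in its own {{...}} template.
gender = re.compile(r"'''\{\{PAGENAME\}\}''' \{\{([θαο])\}\}")

# The two variant structures quoted at the top of the thread.
samples = ["{{ΔΦΑ|tɾeˈlos|γλ=el}}", "{{ΔΦΑ|γλ=el|ˈni.xta}}"]
found = [pronunciation.search(s).group(1) for s in samples]
```

The original patterns could not match because the pronunciation regex expected slashes around the transcription and the gender regex left out the `{{` before the letter.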

chopinesque commented 3 years ago

Actually, the pronunciation is not that necessary for Greek Wiktionary. Verbs don't have gender, so shouldn't it be ignored?

> I have a working pronunciation regex and the definitions working. Also I put the lang script in and the tests. I will make a PR so you can continue and implement templates.

Thanks for the help!

lasconic commented 3 years ago

I didn't know it was a verb :)

lasconic commented 3 years ago

Please have a look at the PR.

chopinesque commented 3 years ago

What do I have to do, Fetch upstream > Fetch and Merge from my master?

lasconic commented 3 years ago

The code is on the master of this repository now. Since you modified your master branch (you shouldn't do that in theory), I'm not sure how to fix it with your UI... If I were you, I would delete my fork, delete the code on my computer, and fork and clone from scratch.

BoboTiG commented 3 years ago

I'll add the Greek locale officially in the README and create the el tag. Nice work :+1:

chopinesque commented 3 years ago

Thanks for all the help guys! Pardon my newbie ignorance.

I now get:

C:\path1\wikidict>python -m wikidict el --get-word="λαμβάνω" --raw
 !! Missing 'λ' template support for word 'λαμβάνω'
 !! Missing 'ετυμ' template support for word 'λαμβάνω'
λαμβάνω \lam.ˈva.nɔ\ ''

'<b>λαμβάνω</b> < < <i>(Ετυμ)</i> *<i>sleh₂gʷ</i>-'

  1. 'παίρνω, δέχομαι'
  2. 'εντοπίζω επιθυμητό σήμα'
  3. '<i>(Μτφρ)</i> καταλαβαίνω'

I suppose these missing templates do not affect the output?

lasconic commented 3 years ago

@BoboTiG thanks ;)

@chopinesque the missing templates do affect the output: see how the etymology is not right because of issue #981 and the ετυμ template. Also check the last definition; it's not the same as on Wiktionary.

BoboTiG commented 3 years ago

Done. And I also triggered the update job, so that the EL dictionary will be available very soon (https://github.com/BoboTiG/ebook-reader-dict/actions/runs/995887503).

@chopinesque Have a look at https://github.com/BoboTiG/ebook-reader-dict/releases/tag/el where you will find dictionaries.

lasconic commented 3 years ago

I filed two issues for the missing templates: https://github.com/BoboTiG/ebook-reader-dict/issues?q=is%3Aissue+is%3Aopen+label%3Alocale%3A%CE%95%CE%BB%CE%BB%CE%B7%CE%BD%CE%B9%CE%BA%CE%AC

@chopinesque you can try to implement these templates to fix the output for λαμβάνω

chopinesque commented 3 years ago

How do I implement these templates? I tried to generate the dictionary and I got numerous missing template notices. The output was meagre, mostly verbs.

 !! Missing 'βθ' template support for word 'από στήθους'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'αμετάλλακτος' (parts=['ετυμ', 'grc-koi', 'el', ' ἀμετάλλακτος'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'p' template support for word 'απόστολος'
 !! Missing 'θηλ του' template support for word 'γκόμενα'
 !! Missing 'θηλ ού' template support for word 'γλεντζού'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'ενδημώ' (parts=['αρχ', ' ἐνδημέω/ -ῶ'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'γραφή του' template support for word 'αμώνω'
 !! Missing 'p' template support for word 'γνησιότητα'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'εμπροσθόδρομα ασύμβατος' (parts=['βλ', 'γλ=el', ' εμπροσθόδρομος', 'ασύμβατος'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'ανέλκυση' (parts=['ελνστ', 'ἀνέλκυσις '])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'β' template support for word 'εν τη ρύμη του λόγου'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'αναβίωση' (parts=[' ελνστ', 'ἀναβίωσις'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'αγν' template support for word 'ενιαυτός'
 !! Missing 'αποδ' template support for word 'εναέρια κυκλοφορία'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'αναβλύζω' (parts=['αρχ', ' ἀναβλύζω'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'αναθέτω' (parts=['αρχ', ' ἀνατίθημι'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'βθ' template support for word 'θου Κύριε'
 !! Missing 'υπερθ' template support for word 'εντονότατα'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'εξάνθηση' (parts=['ετυμ', 'grc-koi', 'el', ' ἐξάνθησις'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'cf' template support for word 'κουζίνα'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'κουπέπι' (parts=['ετυμ', 'ar', 'el', 'كبابة\u200e'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'άγν' template support for word 'κουραμάνα'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'ιεροφυλάκιο' (parts=['ετυμ', 'grc-koi', 'el', ' ἱεροφυλάκιον'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'θηλ τρα' template support for word 'εξολοθρεύτρα'
 !! Missing 'θηλ τρια' template support for word 'εξολοθρεύτρια'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'μούμια' (parts=['ετυμ', 'ar', 'el', 'مُومِيَاء\u200e'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'παράθεμα' template support for word 'κρανίου τόπος'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'μπααθισμός' (parts=['ετυμ', 'ar', 'el', ' البعث '])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'κρεμέζι' (parts=['ετυμ', 'ar', 'el', 'قِرْمِز\u200e'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'ισθμιακός' (parts=['ετυμ', 'grc', 'el', ' Ἰσθμιακός'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'θηλ του' template support for word 'μπαμ τερλελέ'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'εξώρας' (parts=['ετυμ', 'grc', 'el', ' ἔξωρα'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'επέκταση αρχείου' (parts=['βλ', 'γλ=el', ' επέκταση', 'όνομα', 'αρχείο'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 's' template support for word 'επί ξυρού ακμής'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'ιστιοδρομώ' (parts=['ετυμ', 'grc-koi', 'el', ' ἱστιοδρομέω'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'μπεκιάρης' (parts=['δαν', 'ota', 'el', '', 'بكار\u200e', 'tr=bekâr'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'ουσεπ α' template support for word 'επαΐων'
 !! Missing 'μτχε' template support for word 'κυμαινόμενος'
 !! Missing 'απαρ' template support for word 'καβαλικεύω'
 !! Missing 'μτχππ' template support for word 'κωλοπετσωμένος'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'κόλπο' (parts=['ετυμ', 'la', 'el', ' colophus'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'ουσεπ θ' template support for word 'καθαρεύουσα'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'κόλπος' (parts=['ετυμ', 'la', 'el', ' colophus'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'λάθρα' (parts=['αρχ', ' '])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'ουσεπ ο' template support for word 'καθυστερούμενα'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'μόνωση' (parts=['μτφδ', 'fr', 'el', ' isolation'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'μτχεε' template support for word 'λήγων'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'λαθραίως' (parts=['αρχ', ' '])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'επιστεγάζω' (parts=['ετυμ', 'grc', 'el', ' ἐπιστεγάζω'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'λαούτο' (parts=['ετυμ', 'ar', 'el', 'اَلْعُود\u200e'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'επιτροχάδην' (parts=['ετυμ', 'grc-koi', 'el', ' ἐπιτροχάδην'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'θηλ ίνα' template support for word 'καλικαντζαρίνα'
 !! Missing 'θηλ ού' template support for word 'καλοφαγού'
 !! Missing 'μτφρ' template support for word 'καλόπιστος'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'εσχάρωση' (parts=['ετυμ', 'grc', 'el', ' ἐσχάρωσις'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'ουσεπ θ' template support for word 'νοτιά'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'νουβέλα' (parts=['ετυμ', 'la', 'el', ' novellus'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'ελνστ' template support for word 'νουνεχής'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'καρδερίνα' (parts=['ετυμ', 'la-lat', 'el', ' cardellus'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'ευλογία' (parts=['ετυμ', 'grc-koi', 'el', ' εὐλογία'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'bor' template support for word 'ντουέτο'
 !! Missing 'παράθεμα' template support for word 'καριοφίλι'
 !! Missing 'desc' template support for word 'λουφάρι'
 !! Missing 'συγκρ' template support for word 'νωρίτερα'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'κασαμπάς' (parts=['ARchar', 'قَصَبَة\u200e'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'ευτραφώς' (parts=['ετυμ', 'grc', 'el', ' εὐτραφής'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'επικ' template support for word 'ευχέτης'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'καταζητώ' (parts=['σμσδ', 'fr', 'el', ' poursuivre'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 's' template support for word 'μαγκούρα'
 !! Missing 'αναδρομικός' template support for word 'ξεκάθαρος'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'ζενίθ' (parts=['l', 'سمت الرأس', 'ar', 'سَمْتُ الرَّأْس\u200e'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'ζιρκόνιο' (parts=['ετυμ', 'fa', 'el', 'زرگون\u200e'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'παθ' template support for word 'καταπιάνομαι'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'κατασκευάζω' (parts=['ετυμ', 'grc', 'el', ' κατασκευάζω'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'μαούνα' (parts=['λ', 'ماعونه\u200e', 'ota', 'tr=maʼuna'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'μαούνα' (parts=['etym', 'ota', 'el', 'ماونه\u200e', 'tr=mavuna'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'μαραμπού' (parts=['ετυμ', 'ar', 'el', 'مُرابِط\u200e'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'ηθολογία' (parts=['ετυμ', 'grc-koi', 'el', ' ἠθολογία'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'μτχπε' template support for word 'κατηγορουμένη'
 !! Missing 'χρειάζεται' template support for word 'ηλιόσφαιρα'
 !! Missing 'συγκρ' template support for word 'ηπιότερα'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'καφές' (parts=['ετυμ', 'ar', 'el', 'قهوة', 'tr=qahwah', 'قَهْوَة\u200e'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'αττ' template support for word 'θάρρος'
 !! Missing 'ουσεπ ο' template support for word 'μελλούμενα'
 !! Missing 'οικονομία' template support for word 'οικονομία'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'μεταβίβαση' (parts=['μτφδ', 'fr', 'el', ' transmission'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'κινητήριος' (parts=['μτφδ', 'fr', 'el', ' moteur'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'θεόθεν' (parts=['αρχ', ' '])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'μτχπα' template support for word 'μεταστάσα'
 !! Missing 'αρχ' template support for word 'οραματίζομαι'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'μητρόθεν' (parts=['αρχ', ' '])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'χρειάζεται' template support for word 'μικρόταξη'
 !! Missing 'παράθεμα' template support for word 'ουδέν κακόν αμιγές καλού'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'μονάδα ελέγχου' (parts=['βλ', 'γλ=el', ' μονάδα', 'έλεγχος'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'παρανόηση' (parts=['σμσδ', 'en', 'el', ' misunderstanding'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'σουπίν' template support for word 'πασαδόρος'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'πασμίνα' (parts=['ετυμ', 'fa', 'el', 'پشمینه\u200e'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'συναγωνισμός' (parts=['ετυμ', 'gkm', 'el', ' συναγωνισμός'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'ουσεπ ο' template support for word 'υπονοούμενο'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'πατρόθεν' (parts=['αρχ', ' '])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'θηλ τρια' template support for word 'πυροσβέστρια'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'υστεραλγία' (parts=['ετυμ', 'grc', 'el', ' ὕστερον'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'μτχππ' template support for word 'πεπαιδευμένος'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'πεπτικός' (parts=['σμσδ', 'fr', 'el', ' digestif'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'φακίρης' (parts=['λ', 'فقراء', 'ar', 'فُقَرَاء\u200e', 'tr=fuqarāʾ'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'βθ' template support for word 'φαλακρός'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'συνυπολογίζω' (parts=['πρόσφ', ' συν-', 'υπολογίζω'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'ραμαζάνι' (parts=['ετυμ', 'ota', 'el', 'رمضان\u200e'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'φαρφουρί' (parts=['l', 'فغفوری\u200e', 'fa'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'φαρφουρί' (parts=['ετυμ', 'ota', 'el', 'فغفور\u200e'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'αντιδάνειο' template support for word 'φασόλι'
 !! Missing 'συγκρ' template support for word 'συχνότερος'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'πεφωτισμένος' (parts=['αρχ', ' '])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'σωσίβιο' (parts=['ουδ του', ' σωσίβιος'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'πιατέλα' (parts=['μεγ', ' ', 'α'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'τ' template support for word 'πιπέρι'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'φουκαρατζίκος' (parts=['λ', ' fukaracık', 'tr', 'lang=4'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'δημοτική' template support for word 'σαντάλι'
 !! Missing 'ουσεπ θ' template support for word 'πλατεία'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'σαφράν' (parts=['ετυμ', 'ar', 'el', 'زَعْفَرَان\u200e', 'tr=zaʿfarān'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'σαφράνι' (parts=['ετυμ', 'ar', 'el', 'أَصْفَر\u200e'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'μτχεπ' template support for word 'τεθνεώς'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'τερακότα' (parts=[' ετυμ', 'la'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'σερί' (parts=['etym', 'ar', 'el', 'سريع', 'tr=sarīʿ\u200e', 't=γρήγορος', 'سَرِيع'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'σερασκέρης' (parts=['ετυμ', 'fa', 'el', 'سر\u200e', 'tr=sar', 't=κεφάλι'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'πνευμόνι' (parts=[' αρχ'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'συγκρ' template support for word 'σινιόρ'
 !! Missing 'θηλ τρα' template support for word 'χαρτοπαίχτρα'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'σκορπίζω' (parts=['ετυμ', 'grc', 'el', ' σκορπίζω'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'τουρνουά' (parts=[' ετυμ la'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'τρέπω' (parts=['ετυμ', 'ine-pro', 'el', '*trep- '])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'αποδ' template support for word 'χιονοστρόβιλος'
 !! Missing 'bor' template support for word 'τρανσέξουαλ'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'σουλτάνος' (parts=['ετυμ', 'arc', 'el', 'שולטנא\u200e'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'προεπιλεγμένη παράμετρος' (parts=['βλ', 'γλ=el', ' προεπιλεγμένος', 'παράμετρος'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'τσακίρ κέφι' (parts=['δαν', 'ota', 'el', 'چاقر كیف\u200e', 'tr=çakır keyf', 'tno=σε κατάσταση μέθης'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'τσουβάλι' (parts=['δαν', 'ota', 'el', 'چوال\u200e', 'tr=çuval'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
 !! Missing 'θηλ τρια' template support for word 'υγιεινίστρια'
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'ύσσωπος' (parts=['ετυμ', 'he', 'el', 'אזוב\u200e'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
C:\path1\wikidict\wikidict\utils.py:550: UserWarning: Extra character found in the Wikicode of 'ἐγκαινίασις' (parts=['γραπτηεμφ\u200e', '1876'])
  warn(f"Extra character found in the Wikicode of {word!r} (parts={parts_raw})")
>>> Saved 3,438 words into data\el\data-20210701.json
>>> Render done!
>>> Loading data\el\data-20210701.json ...
>>> Loaded 3,438 words from data\el\data-20210701.json
>>> Generated dict-el.df (1,393,635 bytes)
>>> Generated dicthtml-el.zip (493,003 bytes)
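
Many of the `Extra character found` warnings above show template arguments with a stray leading or trailing space (`' misunderstanding'`) or an invisible left-to-right mark (`'\u200e'`). Here is a minimal sketch of normalising such arguments, assuming those characters are what triggers the warning. `clean_parts` is a hypothetical helper, not part of wikidict:

```python
# Hypothetical helper, not part of wikidict: one possible way to normalise
# template arguments before they are rendered.

# Invisible directional marks that often get pasted along with
# Arabic, Persian or Hebrew text (LEFT-TO-RIGHT and RIGHT-TO-LEFT MARK).
DIRECTION_MARKS = "\u200e\u200f"


def clean_parts(parts):
    """Remove directional marks and surrounding whitespace from each part.

    Empty parts are kept on purpose so that positional arguments
    do not shift.
    """
    cleaned = []
    for part in parts:
        for mark in DIRECTION_MARKS:
            part = part.replace(mark, "")
        cleaned.append(part.strip())
    return cleaned


print(clean_parts(["ετυμ", "fa", "el", "پشمینه\u200e"]))
print(clean_parts(["σμσδ", "en", "el", " misunderstanding"]))
```

Whether this belongs in the parser or should be fixed in the Wiktionary pages themselves is a separate question.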

lasconic commented 3 years ago

Have a look at the other languages and how the templates are implemented, in en/__init__.py for example. I believe we need more sections to capture more words.

lasconic commented 3 years ago

I found a name (https://github.com/BoboTiG/ebook-reader-dict/issues/984). I will fix the gender and the extraction of the definition for names (and we will have even more missing templates :)
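
With more missing templates expected, it may help to rank them by how often they occur in the render output. This is a small sketch that assumes the `!! Missing '…' template support for word '…'` line format seen in the log earlier in this thread. The function name `count_missing_templates` is made up for illustration:

```python
import re
from collections import Counter

# Matches lines such as:
#  !! Missing 'συγκρ' template support for word 'συχνότερος'
MISSING = re.compile(
    r"!! Missing '(?P<template>[^']+)' template support for word '(?P<word>[^']+)'"
)


def count_missing_templates(log_text):
    """Return a Counter mapping each missing template name to its occurrences."""
    return Counter(m["template"] for m in MISSING.finditer(log_text))


# A few lines copied from the render output in this thread.
sample = """\
 !! Missing 'συγκρ' template support for word 'συχνότερος'
 !! Missing 'θηλ τρια' template support for word 'πυροσβέστρια'
 !! Missing 'συγκρ' template support for word 'σινιόρ'
 !! Missing 'θηλ τρια' template support for word 'υγιεινίστρια'
 !! Missing 'βθ' template support for word 'φαλακρός'
"""

# Most frequent templates first.
for template, count in count_missing_templates(sample).most_common():
    print(f"{count:>3}  {template}")
```

Run against the full log, the templates at the top of the list would be the first candidates to implement.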

chopinesque commented 3 years ago

I tried:

python -m wikidict el --find-templates
>>> Loading data\el\data_wikicode-20210701.json ...
>>> Loaded 533,213 words from data\el\data_wikicode-20210701.json
>>> Working, please be patient ...
Traceback (most recent call last):
  File "C:\Users\spiros\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\spiros\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\path1\wikidict\wikidict\__main__.py", line 118, in <module>
    sys.exit(main())
  File "C:\path1\wikidict\wikidict\__main__.py", line 73, in main
    return find_templates.main(args["LOCALE"])
  File "C:\path1\wikidict\wikidict\find_templates.py", line 71, in main
    find_templates(in_words, locale)
  File "C:\path1\wikidict\wikidict\find_templates.py", line 44, in find_templates
    f.write(f"    - {entry!r}\n")
  File "C:\Users\spiros\AppData\Local\Programs\Python\Python39\lib\encodings\cp1253.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u1f51' in position 8: character maps to <undefined>

lasconic commented 3 years ago

Weird, it's probably a dependency problem. It works here. The result: https://gist.github.com/lasconic/aa795f488e53049047e833b44e145d0c

@BoboTiG could you explain what we can read in these files?

chopinesque commented 3 years ago

The only dependency that did not install properly is pyglossary. Does not appear relevant here.

BoboTiG commented 3 years ago

I am AFK and will be able to have a look in several hours only.

BoboTiG commented 3 years ago

@chopinesque could you retry from the master branch?

chopinesque commented 3 years ago

I see, encoding had to be declared.

File sections.txt created.
File templates.txt created.
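
For reference, the `UnicodeEncodeError` in the earlier traceback came from `f.write()` going through the Windows default code page (cp1253 here), which cannot represent polytonic characters such as `'\u1f51'`. Below is a minimal sketch of the general remedy, passing `encoding="utf-8"` to `open()`. The file location and the sample entries are illustrative, and this is not necessarily the exact change made on master:

```python
import os
import tempfile

# Sample entries containing polytonic Greek (the first one starts with U+1F51).
entries = ["ὑπό", "ἐγκαινίασις"]

# Illustrative location; the real tool writes into its own working directory.
path = os.path.join(tempfile.mkdtemp(), "templates.txt")

# Without encoding=..., open() falls back to the locale's preferred encoding
# (cp1253 on a Greek Windows setup), which cannot encode these characters.
with open(path, "w", encoding="utf-8") as f:
    for entry in entries:
        f.write(f"    - {entry!r}\n")

# Read the file back with the same encoding to check the round trip.
with open(path, encoding="utf-8") as f:
    content = f.read()

print(f"{len(content.splitlines())} entries written to {path}")
```

Declaring the encoding explicitly makes the output independent of the machine's locale, which is why the same command worked on one setup and failed on another.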