Open paulocoutinhox opened 3 months ago
I tried isolating the expression, and it works in that case:
from __future__ import annotations

import re
from enum import Enum
from typing import Any
from typing import Type


class Book(Enum):
    def __new__(
        cls: Type[Book],
        *args: dict[str, Any],
        **kwargs: dict[str, Any],
    ) -> Book:
        # Use the first tuple element (the book number) as the enum value.
        obj: Book = object.__new__(cls)
        obj._value_ = args[0]
        return obj

    def __init__(
        self: Book,
        _: int,
        title: str,
        regular_expression: str,
        abbreviations: tuple[str, ...],
    ) -> None:
        """Set the title and regular_expression properties."""
        self._title_ = title
        self._regular_expression_ = regular_expression
        self._abbreviations_ = abbreviations

    @property
    def title(self: Book) -> str:
        return self._title_

    @property
    def regular_expression(self: Book) -> str:
        return self._regular_expression_

    @property
    def abbreviations(self: Book) -> tuple[str, ...]:
        return self._abbreviations_

    LEVITICUS = 3, "Leviticus", r"(Lev(?:iticus)?|Lv|Levítico)", ("Lev", "Lv")


# Join every book pattern into one alternation, bounded by word boundaries.
BOOK: str = rf"\b({'|'.join(book.regular_expression for book in Book)})\b\.*"

SCRIPTURE_REFERENCE_REGULAR_EXPRESSION: re.Pattern[str] = re.compile(
    BOOK,
    re.IGNORECASE | re.UNICODE,
)

test_strings = ["Lev", "Lv", "Leviticus", "Levítico"]

for test in test_strings:
    match = SCRIPTURE_REFERENCE_REGULAR_EXPRESSION.search(test)
    if match:
        print(f"Matched: {test} -> {match.group()}")
    else:
        print(f"No match: {test}")
Output:
Matched: Lev -> Lev
Matched: Lv -> Lv
Matched: Leviticus -> Leviticus
Matched: Levítico -> Levítico
I found the problem:
convert_all_roman_numerals_to_integers(text)
Can you make this optional too?
Thanks.
That solves 4 of the 5 problems. One remains: "ob 1", which should resolve to OBADIAH. The text "ab 1" works, but "ob 1" does not.
I found the bug:
Traceback (most recent call last):
  File "/opt/homebrew/lib/python3.10/site-packages/django/core/handlers/exception.py", line 55, in inner
    response = get_response(request)
  File "/opt/homebrew/lib/python3.10/site-packages/django/core/handlers/base.py", line 197, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/Users/paulo/Developer/workspaces/python/avinuteologia/apps/web/views/bible.py", line 41, in search_view
    bible_references = pb.get_references(query)
  File "/opt/homebrew/lib/python3.10/site-packages/pythonbible/parser.py", line 50, in get_references
    references.extend(normalize_reference(reference_match[0]))
  File "/opt/homebrew/lib/python3.10/site-packages/pythonbible/parser.py", line 98, in normalize_reference
    first_book_references = _process_sub_references(
  File "/opt/homebrew/lib/python3.10/site-packages/pythonbible/parser.py", line 147, in _process_sub_references
    start_chapter, start_verse, end_chapter, end_verse = _process_sub_reference(
  File "/opt/homebrew/lib/python3.10/site-packages/pythonbible/parser.py", line 194, in _process_sub_reference
    start_chapter = int(min_chapter_and_verse[0].strip())
ValueError: invalid literal for int() with base 10: 'b 1'
It is trying to convert part of "ob 1" to int: the traceback shows the remainder "b 1" being passed to int().
Solved: the problem was that I had added a single-letter abbreviation for HOSEA. Now it works.
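For anyone hitting the same error, here is a minimal sketch of how a single-letter alternative can swallow the first letter of another abbreviation and leave "b 1" behind. The patterns and the split_book helper are hypothetical and are not pythonbible's actual regexes or parser; the word-boundary guard is only one possible mitigation, whereas the fix described above was simply removing the single-letter abbreviation.

```python
import re

# Hypothetical patterns for illustration only; not pythonbible's actual regexes.
HOSEA = r"Hos(?:ea)?|O"        # includes a single-letter abbreviation
OBADIAH = r"Obad(?:iah)?|Ob"


def split_book(text, patterns, guard=False):
    """Return (book_text, remainder) for the first pattern matching at the start.

    With guard=True, each pattern must also end on a word boundary.
    """
    for pattern in patterns:
        regex = rf"(?:{pattern})\b" if guard else rf"(?:{pattern})"
        match = re.match(regex, text, re.IGNORECASE)
        if match:
            return match.group(), text[match.end():]
    return None, text


# Without the guard, the single letter "O" matches first and leaves "b 1",
# which int() cannot parse.
unguarded = split_book("ob 1", [HOSEA, OBADIAH])
print(unguarded)

# With the guard, "O" is rejected because "b" follows it, so "Ob" matches.
guarded = split_book("ob 1", [HOSEA, OBADIAH], guard=True)
print(guarded, int(guarded[1].strip()))
```

In this sketch, the order of the patterns matters only when the guard is off; with the guard, a short alternative can no longer match the prefix of a longer word.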
The only remaining problem is the Roman numeral conversion: it should either be optional, because it interprets "lv" as a Roman numeral, or it should check whether the letters are a book abbreviation and skip them in that case.
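To illustrate the collision, here is a rough sketch of a Roman numeral converter with an opt-out list. It is not the library's convert_all_roman_numerals_to_integers implementation; the function names, the skip parameter, and the regexes are assumptions made for this example.

```python
import re

ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

# Any whole word made only of Roman numeral letters is a candidate.
CANDIDATE = re.compile(r"\b[MDCLXVI]+\b", re.IGNORECASE)
# A candidate is converted only if it is a well-formed numeral.
VALID = re.compile(
    r"M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})",
    re.IGNORECASE,
)


def roman_to_int(numeral):
    """Convert a well-formed upper-case Roman numeral to an integer."""
    total = 0
    for current, following in zip(numeral, numeral[1:] + " "):
        value = ROMAN_VALUES[current]
        # A smaller value before a larger one is subtracted (IV = 4).
        total += -value if ROMAN_VALUES.get(following, 0) > value else value
    return total


def convert_roman_numerals(text, skip=frozenset()):
    """Replace Roman numerals with integers, leaving words listed in skip alone."""
    def replace(match):
        token = match.group()
        if token.lower() in skip or not VALID.fullmatch(token):
            return token
        return str(roman_to_int(token.upper()))

    return CANDIDATE.sub(replace, text)


print(convert_roman_numerals("lv 1"))               # "lv" is read as LV = 55
print(convert_roman_numerals("lv 1", skip={"lv"}))  # abbreviation preserved
print(convert_roman_numerals("II Kings 3", skip={"lv"}))
```

Passing the known book abbreviations as skip keeps "lv" intact while still converting prefixes such as "II".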
Hi,
I made a new version of the books in a fork to support Portuguese/Spanish.
I tested them one by one, but I have a problem with only 5 books:
For example (above), it understands "lev 1" but not "lv 1", and the same goes for the others. I tested with
LEVITICUS = 3, "Leviticus", r"(Lev|Lv)", ("Lev", "Lv")
but that doesn't work either. I don't know what I'm doing wrong. Can you point out what is wrong in the regular expressions for these five cases?
Thanks.