Toilal / rebulk

Define simple search patterns in bulk to perform advanced matching on any string
MIT License

Chain patterns are not properly detected in certain scenarios #7

Closed by ratoaq2 7 years ago

ratoaq2 commented 7 years ago

For rebulk version 0.7.6

The minimum I can do is share a failing test case:

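import re
from functools import partial

from rebulk import Rebulk
from rebulk.validators import chars_surround


def test_matches_7():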
    seps_surround = partial(chars_surround, ' .-/')
    rebulk = Rebulk()
    rebulk.regex_defaults(flags=re.IGNORECASE)
    rebulk.defaults(validate_all=True,
                    children=True, private_parent=True,
                    validator={'__parent__': seps_surround})

    rebulk.chain().\
        regex(r'S(?P<season>\d+)').\
        regex(r'[ -](?P<season>\d+)').repeater('*')

    # int() keeps the checks valid whether value is the raw string or a formatted int
    matches = rebulk.matches("Some S01")
    assert len(matches) == 1
    assert int(matches[0].value) == 1

    matches = rebulk.matches("Some S01-02")
    assert len(matches) == 2
    assert int(matches[0].value) == 1
    assert int(matches[1].value) == 2

    matches = rebulk.matches("programs4/Some S01-02")
    assert len(matches) == 2
    assert int(matches[0].value) == 1
    assert int(matches[1].value) == 2

    matches = rebulk.matches("programs4/SomeS01middle.S02-03.andS04here")
    assert len(matches) == 2
    assert int(matches[0].value) == 2
    assert int(matches[1].value) == 3

    matches = rebulk.matches("Some 02.and.S04-05.here")
    assert len(matches) == 2
    assert int(matches[0].value) == 4
    assert int(matches[1].value) == 5

This bug is related to https://github.com/guessit-io/guessit/issues/359
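To make the expected results explicit without relying on rebulk, here is a rough plain-`re` sketch of what I understand the test to be asking for: a whole chain (`S01`, optionally followed by `-02` or ` 02` parts) only counts when it is bounded by one of the separators ` .-/` or by a string edge. The names `CHAIN`, `surrounded` and `expected_seasons` are hypothetical helpers written for this illustration. They are not part of rebulk, and this is not how rebulk implements chains or validators.

```python
import re

SEPS = ' .-/'

# Hypothetical stand-in for the whole chain: "S<digits>" followed by
# zero or more "-<digits>" or " <digits>" parts.
CHAIN = re.compile(r'S\d+(?:[ -]\d+)*', re.IGNORECASE)


def surrounded(text, start, end, seps=SEPS):
    """True if text[start:end] is bounded on both sides by a separator or a string edge."""
    left_ok = start == 0 or text[start - 1] in seps
    right_ok = end == len(text) or text[end] in seps
    return left_ok and right_ok


def expected_seasons(text):
    """Collect season numbers from every chain candidate that is properly surrounded."""
    seasons = []
    for candidate in CHAIN.finditer(text):
        if surrounded(text, candidate.start(), candidate.end()):
            seasons.extend(int(n) for n in re.findall(r'\d+', candidate.group(0)))
    return seasons
```

With this sketch, `s4` in `programs4`, `S01` in `SomeS01middle` and `S04` in `andS04here` are all rejected because a letter touches them on at least one side, which leaves only `S02-03` in the fourth input.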

Toilal commented 7 years ago

Thank you, it really helps. I'm debugging at the moment.