trevorbaca / baca

Trevor Bača's Abjad library.

Optimize contexted indicator attach #52

Closed: trevorbaca closed this issue 2 years ago

trevorbaca commented 2 years ago

The performance hit from Abjad's checking for attachment-time conflicts between contexted indicators is substantial. Here's an 8000-note test score that takes 2 seconds to create, together with examples showing that attaching 16,000 strings or 16,000 articulations takes basically no time:

with abjad.Timer() as timer:
    score = abjad.Score([
        abjad.Staff([abjad.Note("c'4") for _ in range(1000)]),
        abjad.Staff([abjad.Note("c'4") for _ in range(1000)]),
        abjad.Staff([abjad.Note("c'4") for _ in range(1000)]),
        abjad.Staff([abjad.Note("c'4") for _ in range(1000)]),
        abjad.Staff([abjad.Note("c'4") for _ in range(1000)]),
        abjad.Staff([abjad.Note("c'4") for _ in range(1000)]),
        abjad.Staff([abjad.Note("c'4") for _ in range(1000)]),
        abjad.Staff([abjad.Note("c'4") for _ in range(1000)]),
    ])
print(int(timer.elapsed_time))
2
with abjad.Timer() as timer:
    for note in abjad.iterate.leaves(score):
        abjad.attach("foo", note)
        abjad.attach("bar", note)
print(int(timer.elapsed_time))
0
with abjad.Timer() as timer:
    for note in abjad.iterate.leaves(score):
        abjad.attach(abjad.Articulation("staccato"), note)
        abjad.attach(abjad.Articulation("marcato"), note)
print(int(timer.elapsed_time))
0

But because abjad.StartSlur and abjad.StopSlur are contexted (at the level of the voice), attaching 16,000 slur indicators takes about 9 seconds:

with abjad.Timer() as timer:
    for note in abjad.iterate.leaves(score):
        abjad.attach(abjad.StartSlur(), note)
        abjad.attach(abjad.StopSlur(), note)
print(int(timer.elapsed_time))
9

Toy contexted classes show that contexting is responsible for the entirety of the slowdown:

class ContextedFoo:
    context = "Voice"

class ContextedBar:
    context = "Voice"

with abjad.Timer() as timer:
    for note in abjad.iterate.leaves(score):
        abjad.attach(ContextedFoo(), note)
        abjad.attach(ContextedBar(), note)
print(int(timer.elapsed_time))
12

How to optimize? These are yet more data pointing to excess score updates as responsible for most -- and maybe all -- of the best opportunities for performance optimization in Abjad. Revisit these examples when it is time to optimize Abjad, then proceed in the usual way: profile many different examples in the system first, and optimize only when all slowdowns are fully understood.

A couple of simple strategies are available, however. Perhaps the most direct would be to stop checking for conflicts between contexted indicators at attach-time; such checks could be deferred until format-time, and also wrapped in a public function so that users can run them on demand while composing.
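As a rough sketch of the deferred approach (the function name and the duplicate criterion -- the same contexted indicator class attached to more than one leaf in a voice -- are assumptions for illustration, not existing Abjad API):

import collections

import abjad

def check_duplicate_contexted_indicators(voice):
    # Hypothetical on-demand check: collect every contexted indicator in the
    # voice, keyed by indicator class, and report classes attached to more
    # than one leaf. A single pass like this costs one scan per voice,
    # instead of a scan repeated at every attach.
    leaves_by_class = collections.defaultdict(list)
    for leaf in abjad.iterate.leaves(voice):
        for indicator in abjad.get.indicators(leaf):
            if getattr(indicator, "context", None) is not None:
                leaves_by_class[type(indicator)].append(leaf)
    return {
        class_: leaves
        for class_, leaves in leaves_by_class.items()
        if 1 < len(leaves)
    }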

trevorbaca commented 2 years ago

Test functions:

def make_score(staff_count, note_count):
    with abjad.Timer() as timer:
        staves = []
        for i in range(staff_count):
            notes = [abjad.Note("c'4") for _ in range(note_count)]
            voice = abjad.Voice(notes)
            staff = abjad.Staff([voice])
            staves.append(staff)
    total = staff_count * note_count
    time = int(timer.elapsed_time)
    print(f"made {staff_count} * {note_count} == {total} notes in {time} seconds ...")
    return abjad.Score(staves)

def attach_indicators(score, indicator_classes):
    with abjad.Timer() as timer:
        leaves = abjad.select(score).leaves()
        for leaf in leaves:
            for indicator_class in indicator_classes:
                indicator = indicator_class()
                abjad.attach(indicator, leaf)
    total = len(indicator_classes) * len(leaves)
    time = int(timer.elapsed_time)
    print(f"attached {total} contexted indicators in {time} seconds ...")

class Foo:
    context = "Voice"

class Bar:
    context = "Voice"

class Baz:
    context = "Voice"

class Qux:
    context = "Voice"

Starting input, output:

for note_count in [100, 200, 400, 800, 1600]:
    score = make_score(4, note_count)
    attach_indicators(score, [Foo])
    print()
made 4 * 100 == 400 notes in 0 seconds ...
attached 400 contexted indicators in 0 seconds ...

made 4 * 200 == 800 notes in 0 seconds ...
attached 800 contexted indicators in 1 seconds ...

made 4 * 400 == 1600 notes in 0 seconds ...
attached 1600 contexted indicators in 7 seconds ...

made 4 * 800 == 3200 notes in 1 seconds ...
attached 3200 contexted indicators in 29 seconds ...

made 4 * 1600 == 6400 notes in 2 seconds ...
attached 6400 contexted indicators in 124 seconds ...

But performance slows as a function of notes-per-voice (not total notes in score):

for note_count in [50, 100, 200, 400, 800]:
    score = make_score(8, note_count)
    attach_indicators(score, [Foo])
    print()
made 8 * 50 == 400 notes in 0 seconds ...
attached 400 contexted indicators in 0 seconds ...

made 8 * 100 == 800 notes in 0 seconds ...
attached 800 contexted indicators in 1 seconds ...

made 8 * 200 == 1600 notes in 0 seconds ...
attached 1600 contexted indicators in 3 seconds ...

made 8 * 400 == 3200 notes in 1 seconds ...
attached 3200 contexted indicators in 14 seconds ...

made 8 * 800 == 6400 notes in 2 seconds ...
attached 6400 contexted indicators in 58 seconds ...

Comparing the two runs bears this out: the same 3200 total notes take 29 seconds to process as 4 voices of 800 notes, but only 14 seconds as 8 voices of 400 notes. So 1-voice metrics are probably cleanest:

for note_count in [400, 800, 1600, 3200]:
    score = make_score(1, note_count)
    attach_indicators(score, [Foo])
    print()
made 1 * 400 == 400 notes in 0 seconds ...
attached 400 contexted indicators in 1 seconds ...

made 1 * 800 == 800 notes in 0 seconds ...
attached 800 contexted indicators in 7 seconds ...

made 1 * 1600 == 1600 notes in 0 seconds ...
attached 1600 contexted indicators in 29 seconds ...

made 1 * 3200 == 3200 notes in 1 seconds ...
attached 3200 contexted indicators in 120 seconds ...

Observation: as note count doubles, attach time quadruples.
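
Checking the successive ratios of the 1-voice timings above makes the point explicit (numbers copied from the output; the first ratio is inflated by the 1-second timer resolution at the smallest size):

timings = {400: 1, 800: 7, 1600: 29, 3200: 120}  # seconds, from the run above
counts = sorted(timings)
# successive ratios come out to roughly 7.0, 4.1, 4.1
for a, b in zip(counts, counts[1:]):
    print(f"{a} -> {b} notes: {timings[b] / timings[a]:.1f}x slower")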

Now we vary indicators-per-leaf.

2 indicators per leaf:

for note_count in [400, 800, 1600, 3200]:
    score = make_score(1, note_count)
    attach_indicators(score, [Foo, Bar])
    print()
made 1 * 400 == 400 notes in 0 seconds ...
attached 800 contexted indicators in 3 seconds ...

made 1 * 800 == 800 notes in 0 seconds ...
attached 1600 contexted indicators in 15 seconds ...

made 1 * 1600 == 1600 notes in 0 seconds ...
attached 3200 contexted indicators in 61 seconds ...

made 1 * 3200 == 3200 notes in 1 seconds ...
attached 6400 contexted indicators in 249 seconds ...

3 indicators per leaf:

for note_count in [400, 800, 1600, 3200]:
    score = make_score(1, note_count)
    attach_indicators(score, [Foo, Bar, Baz])
    print()
made 1 * 400 == 400 notes in 0 seconds ...
attached 1200 contexted indicators in 6 seconds ...

made 1 * 800 == 800 notes in 0 seconds ...
attached 2400 contexted indicators in 24 seconds ...

made 1 * 1600 == 1600 notes in 0 seconds ...
attached 4800 contexted indicators in 90 seconds ...

made 1 * 3200 == 3200 notes in 1 seconds ...
attached 9600 contexted indicators in 361 seconds ...

4 indicators per leaf:

for note_count in [400, 800, 1600, 3200]:
    score = make_score(1, note_count)
    attach_indicators(score, [Foo, Bar, Baz, Qux])
    print()
made 1 * 400 == 400 notes in 0 seconds ...
attached 1600 contexted indicators in 8 seconds ...

made 1 * 800 == 800 notes in 0 seconds ...
attached 3200 contexted indicators in 32 seconds ...

made 1 * 1600 == 1600 notes in 0 seconds ...
attached 6400 contexted indicators in 131 seconds ...

made 1 * 3200 == 3200 notes in 1 seconds ...
???

Observation: runtime increases linearly with indicators-per-leaf.

Observation: runtime increases quadratically with leaves-per-voice (doubling the leaf count quadruples the runtime).
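
Both observations are consistent with a simple cost model in which every attach scans all leaves of the enclosing voice. The scan-per-attach mechanism is an assumption about the cause, not a profile of Abjad internals; the constant below is calibrated from the 1-indicator, 1600-note measurement (29 seconds):

def predicted_seconds(leaves_per_voice, indicators_per_leaf, unit=29 / 1600 ** 2):
    # total work ~ (number of attaches) * (leaves scanned per attach)
    return unit * indicators_per_leaf * leaves_per_voice ** 2

for indicators_per_leaf in [1, 2, 3, 4]:
    row = [round(predicted_seconds(n, indicators_per_leaf)) for n in [400, 800, 1600, 3200]]
    print(indicators_per_leaf, row)

The predictions (2, 7, 29, 116 seconds for one indicator per leaf, scaling linearly with the indicator count) track the measured times reasonably closely.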

trevorbaca commented 2 years ago

With removal of _warn_duplicate_indicator():

print("set 1 ...\n")

for note_count in [400, 800, 1600, 3200]:
    score = make_score(1, note_count)
    attach_indicators(score, [Foo])
    print()

print("set 2 ...\n")

for note_count in [400, 800, 1600, 3200]:
    score = make_score(1, note_count)
    attach_indicators(score, [Foo, Bar])
    print()

print("set 3 ...\n")

for note_count in [400, 800, 1600, 3200]:
    score = make_score(1, note_count)
    attach_indicators(score, [Foo, Bar, Baz])
    print()

print("set 4 ...\n")

for note_count in [400, 800, 1600, 3200]:
    score = make_score(1, note_count)
    attach_indicators(score, [Foo, Bar, Baz, Qux])
    print()
set 1 ...

made 1 * 400 == 400 notes in 0 seconds ...
attached 400 contexted indicators in 0 seconds ...

made 1 * 800 == 800 notes in 0 seconds ...
attached 800 contexted indicators in 0 seconds ...

made 1 * 1600 == 1600 notes in 0 seconds ...
attached 1600 contexted indicators in 0 seconds ...

made 1 * 3200 == 3200 notes in 1 seconds ...
attached 3200 contexted indicators in 0 seconds ...

set 2 ...

made 1 * 400 == 400 notes in 0 seconds ...
attached 800 contexted indicators in 0 seconds ...

made 1 * 800 == 800 notes in 0 seconds ...
attached 1600 contexted indicators in 0 seconds ...

made 1 * 1600 == 1600 notes in 0 seconds ...
attached 3200 contexted indicators in 0 seconds ...

made 1 * 3200 == 3200 notes in 1 seconds ...
attached 6400 contexted indicators in 0 seconds ...

set 3 ...

made 1 * 400 == 400 notes in 0 seconds ...
attached 1200 contexted indicators in 0 seconds ...

made 1 * 800 == 800 notes in 0 seconds ...
attached 2400 contexted indicators in 0 seconds ...

made 1 * 1600 == 1600 notes in 0 seconds ...
attached 4800 contexted indicators in 0 seconds ...

made 1 * 3200 == 3200 notes in 1 seconds ...
attached 9600 contexted indicators in 1 seconds ...

set 4 ...

made 1 * 400 == 400 notes in 0 seconds ...
attached 1600 contexted indicators in 0 seconds ...

made 1 * 800 == 800 notes in 0 seconds ...
attached 3200 contexted indicators in 0 seconds ...

made 1 * 1600 == 1600 notes in 0 seconds ...
attached 6400 contexted indicators in 1 seconds ...

made 1 * 3200 == 3200 notes in 1 seconds ...
attached 12800 contexted indicators in 2 seconds ...
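
For reference, one hypothetical way to reproduce this "with removal" comparison without editing the library is to no-op the private check before timing. Whether _warn_duplicate_indicator lives as a module-level function in the module that defines abjad.attach (and whether attach() looks it up through that namespace) is an assumption that may not hold across Abjad versions:

import importlib

import abjad

# benchmarking only: replace the duplicate-indicator check with a no-op
_attach_module = importlib.import_module(abjad.attach.__module__)
if hasattr(_attach_module, "_warn_duplicate_indicator"):
    _attach_module._warn_duplicate_indicator = lambda *args, **kwargs: None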

trevorbaca commented 2 years ago

Opened as Abjad #1380.