CUNY-CL / latin_scansion

Apache License 2.0

Merge markups #90

Open jillianchang opened 3 years ago

jillianchang commented 3 years ago

This PR integrates the markup branch into the master branch.

jillianchang commented 3 years ago

Can #70 be visited in order to apply the variable markup?

kylebgorman commented 3 years ago

Can #70 be visited in order to apply the variable markup?

I'm not sure what you mean re: #70. It seems like everything is working with respect to that and you updated the tests.

kylebgorman commented 3 years ago

This all looks good to me. My only thought: do we want to keep the old and new behavior in separate grammars so users can choose whether to have variable markup? Just thinking out loud.

Even if we do this maybe we should just merge now.

It doesn't seem like the CircleCI tests are running when you submit (they run for me). I'm not sure what that's about.

jillianchang commented 3 years ago

Can #70 be visited in order to apply the variable markup?

I'm not sure what you mean re: #70. It seems like everything is working with respect to that and you updated the tests.


__________________________ ScansionTest.test_aen_1_1 ___________________________

self =

def test_aen_1_1(self):
    text = "Arma virumque canō, Trojae quī prīmus ab ōris"
    verse = self.scan_verse(text, 1)
    self.assertEqual(verse.number, 1)
    self.assertEqual(verse.text, text)
    self.assertEqual(
        verse.norm, "arma virumque canō trojae quī prīmus ab ōris"
    )
    self.assertEqual(
        verse.raw_pron, "arma wirũːkwe kanoː trojjaj kwiː priːmus ab oːris"
    )
    self.assertEqual(
        verse.var_pron, "arma wirũːkwe kanoː trojjaj kwiː priːmus‿ab‿oːris"
    )

    # Tests foot structures.
    self.assertEqual(verse.foot[0].type, latin_scansion.Foot.DACTYL)
    self.assertEqual(verse.foot[1].type, latin_scansion.Foot.DACTYL)
    self.assertEqual(verse.foot[2].type, latin_scansion.Foot.SPONDEE)
    self.assertEqual(verse.foot[3].type, latin_scansion.Foot.SPONDEE)
    self.assertEqual(verse.foot[4].type, latin_scansion.Foot.DACTYL)
    self.assertEqual(verse.foot[5].type, latin_scansion.Foot.SPONDEE)

    # Tests syllable weights.
    self.assertEqual(
        verse.foot[0].syllable[0].weight, latin_scansion.Syllable.HEAVY
    )
    self.assertEqual(
        verse.foot[0].syllable[1].weight, latin_scansion.Syllable.LIGHT
    )
    self.assertEqual(
        verse.foot[0].syllable[2].weight, latin_scansion.Syllable.LIGHT
    )
    self.assertEqual(
        verse.foot[1].syllable[0].weight, latin_scansion.Syllable.HEAVY
    )
    self.assertEqual(
        verse.foot[1].syllable[1].weight, latin_scansion.Syllable.LIGHT
    )
    self.assertEqual(
        verse.foot[1].syllable[2].weight, latin_scansion.Syllable.LIGHT
    )
    self.assertEqual(
        verse.foot[2].syllable[0].weight, latin_scansion.Syllable.HEAVY
    )
    self.assertEqual(
        verse.foot[2].syllable[1].weight, latin_scansion.Syllable.HEAVY
    )
    self.assertEqual(
        verse.foot[3].syllable[0].weight, latin_scansion.Syllable.HEAVY
    )
    self.assertEqual(
        verse.foot[3].syllable[1].weight, latin_scansion.Syllable.HEAVY
    )
    self.assertEqual(
        verse.foot[4].syllable[0].weight, latin_scansion.Syllable.HEAVY
    )
    self.assertEqual(
        verse.foot[4].syllable[1].weight, latin_scansion.Syllable.LIGHT
    )
    self.assertEqual(
        verse.foot[4].syllable[2].weight, latin_scansion.Syllable.LIGHT
    )
    self.assertEqual(
        verse.foot[2].syllable[0].weight, latin_scansion.Syllable.HEAVY
    )
    self.assertEqual(
        verse.foot[5].syllable[1].weight, latin_scansion.Syllable.HEAVY
    )
    self.assertEqual(
        verse.foot[5].syllable[0].weight, latin_scansion.Syllable.HEAVY
    )
    self.assertEqual(
        verse.foot[5].syllable[1].weight, latin_scansion.Syllable.HEAVY
    )

    # Tests subsyllabic units.
    self.assertEqual(verse.foot[0].syllable[0].nucleus, "a")
    self.assertEqual(verse.foot[0].syllable[0].coda, "r")
    self.assertEqual(verse.foot[0].syllable[1].onset, "m")
    self.assertEqual(verse.foot[0].syllable[1].nucleus, "a")
    self.assertEqual(verse.foot[0].syllable[2].onset, "w")
    self.assertEqual(verse.foot[0].syllable[2].nucleus, "i")
    self.assertEqual(verse.foot[1].syllable[0].onset, "r")
    self.assertEqual(verse.foot[1].syllable[0].nucleus, "ũː")
    self.assertEqual(verse.foot[1].syllable[1].onset, "kw")
    self.assertEqual(verse.foot[1].syllable[1].nucleus, "e")
    self.assertEqual(verse.foot[1].syllable[2].onset, "k")
    self.assertEqual(verse.foot[1].syllable[2].nucleus, "a")
    self.assertEqual(verse.foot[2].syllable[0].onset, "n")
    self.assertEqual(verse.foot[2].syllable[0].nucleus, "oː")
    self.assertEqual(verse.foot[2].syllable[1].onset, "tr")
    self.assertEqual(verse.foot[2].syllable[1].nucleus, "o")
    self.assertEqual(verse.foot[2].syllable[1].coda, "j")
    self.assertEqual(verse.foot[3].syllable[0].onset, "j")
    self.assertEqual(verse.foot[3].syllable[0].nucleus, "a")
    self.assertEqual(verse.foot[3].syllable[0].coda, "j")
    self.assertEqual(verse.foot[3].syllable[1].onset, "kw")
    self.assertEqual(verse.foot[3].syllable[1].nucleus, "iː")
    self.assertEqual(verse.foot[4].syllable[0].onset, "pr")
    self.assertEqual(verse.foot[4].syllable[0].nucleus, "iː")
    self.assertEqual(verse.foot[4].syllable[1].onset, "m")
    self.assertEqual(verse.foot[4].syllable[1].nucleus, "u")
>   self.assertEqual(verse.foot[4].syllable[2].onset, "s")

E   AssertionError: '‿' != 's'
E   - ‿
E   + s

tests/scansion_test.py:131: AssertionError


As shown above, the tests of foot and syllable structure fail because the backward composition doesn't yet use the variable markup.

kylebgorman commented 3 years ago

It seems like you should be able to just modify the tests? Sorry if I'm missing some detail. That whole backwards composition approach shouldn't care either way about the nature of the grammars...

jillianchang commented 3 years ago

The test results are saying that the tie is the onset, but wouldn’t we want the onset to be “s”?

kylebgorman commented 3 years ago

The test results are saying that the tie is the onset, but wouldn’t we want the onset to be “s”?

We'd want that. How the markup characters are "syllabified" is kind of up to you, I suppose.
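
A minimal sketch of one option, written here as an illustration and not taken from the repository: treat the tie as non-segmental and drop it before splitting a syllable into onset, nucleus, and coda. The helper `split_syllable`, the constants, and the vowel inventory are all assumptions made for this example; glides such as "j" and "w" are treated as consonants, matching the expectations in the test above.

```python
import re
import unicodedata

TIE = "\u203f"   # the undertie that appears in var_pron
LONG = "\u02d0"  # IPA length mark

# Assumed vowel inventory: plain and nasalized (precomposed) vowels.
_VOWELS = "aeiou\u00e3\u1ebd\u0129\u00f5\u0169"

# onset = leading non-vowels, nucleus = one vowel plus optional length,
# coda = trailing non-vowels.
_SYLLABLE = re.compile(
    rf"(?P<onset>[^{_VOWELS}]*)"
    rf"(?P<nucleus>[{_VOWELS}]{LONG}?)"
    rf"(?P<coda>[^{_VOWELS}]*)"
)


def split_syllable(syllable, strip_markup=True):
    """Splits one syllable string into (onset, nucleus, coda).

    Hypothetical helper: with strip_markup=True the tie is removed first,
    so it can never be reported as part of the onset or coda.
    """
    # NFC folds "u" + combining tilde into the precomposed nasal vowel.
    syllable = unicodedata.normalize("NFC", syllable)
    if strip_markup:
        syllable = syllable.replace(TIE, "")
    match = _SYLLABLE.fullmatch(syllable)
    if match is None:
        raise ValueError(f"cannot split syllable: {syllable!r}")
    return match["onset"], match["nucleus"], match["coda"]
```

Under these assumptions, the resyllabified syllable written `s‿a` yields onset `s`, nucleus `a`, and an empty coda. The real grammar is FST-based, so this only illustrates the intended outcome, not how the project computes it.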

jillianchang commented 3 years ago

I'm actually quite stuck on why it's displaying the tie as the onset, instead of the letter itself. Would you be able to look at that when you get the chance?

kylebgorman commented 3 years ago

I'm actually quite stuck on why it's displaying the tie as the onset, instead of the letter itself. Would you be able to look at that when you get the chance?

I took a look and it's not at all clear to me either. Do any of the intermediate representations we generate in the textproto give a clue?
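
One way to look for such a clue, sketched under assumptions: walk the parsed feet and report every onset, nucleus, or coda that contains a markup character. The `Syllable` dataclass below is a hypothetical stand-in for the syllable messages in the textproto, and `find_markup_fields` is an illustrative name, not part of latin_scansion.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple

TIE = "\u203f"  # the undertie used as variable markup


@dataclass
class Syllable:
    """Hypothetical stand-in for a parsed syllable."""

    onset: str = ""
    nucleus: str = ""
    coda: str = ""


def find_markup_fields(
    feet: List[List[Syllable]], markup: str = TIE
) -> Iterator[Tuple[int, int, str, str]]:
    """Yields (foot index, syllable index, field name, value) for each
    subsyllabic field that contains any character from `markup`."""
    for foot_index, foot in enumerate(feet):
        for syllable_index, syllable in enumerate(foot):
            for name in ("onset", "nucleus", "coda"):
                value = getattr(syllable, name)
                if any(char in markup for char in value):
                    yield foot_index, syllable_index, name, value
```

Running this over every verse would list each place where the tie was parsed as a segment, which could help narrow down whether the problem is confined to resyllabified syllables.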

I am wondering if we should pause development on this branch and focus on other stuff. I should be able to get back to the web stuff shortly. I made some progress earlier and just need a half-day of focused time. Sorry for my slow response.

jillianchang commented 3 years ago

Yeah, the textproto shows the tie as the onset wherever there's resyllabification.

Okay, that's probably a good idea. This branch seems fine with regard to the actual scanning; it's just the parsing of the specific syllable elements that's not quite accurate. I'm not sure how big a deal that is.

In the meantime, I can start adding comments to the textproto files.

kylebgorman commented 3 years ago

That should work. We just need to remember not to lose the comments going forward.

jillianchang commented 2 years ago

Should we come back to this so that the web app can use markup pronunciation?

kylebgorman commented 2 years ago

We could come back to that at some point, sure.

jillianchang commented 2 years ago

I'm starting to investigate this further. My question is: how exactly does it determine the onset, nucleus, and coda of a syllable (as generated in the textprotos)? If we can look at the code that does that, maybe we can get a clue.

kylebgorman commented 2 years ago

See here. I have no clue how to translate this into useful pure-FST logic. It is some of the hardest code I've ever written.

One of the many motivations for creating Pynini is that sometimes we need to (or at least very much want to) mix declarative grammar-defining logic with general-purpose imperative logic that works on specific inputs, like you see there.