rendering flaws and fixes

paul-shannon / slexil

Software Linking Elan Xml to Illuminated Language

MIT License


rendering flaws and fixes #4

Closed paul-shannon closed 5 years ago

paul-shannon commented 5 years ago

@davidjamesbeck You offer a magnificent bug report!

1) the cells in the grid aren’t aligned (maybe the change to left-aligned cells I made to the CSS didn’t do it or wasn’t committed?)

Good point. The CSS and JavaScript used in the rendering are now (but not for long) hosted by pshannon.net. They seem to be out of date. I will revisit them and include your left-alignment. Then, given that they are both small, and because we want the downloaded html+sounds to be self-contained, needing no internet access, I propose to include them directly in the html file. Same for the audio icon. Your thoughts?

2) the colons in glosses get swapped out for hyphens (by the method that creates small caps?). We also should not have hyphens but en dashes (–), it improves legibility (see the gloss of “smarrita” at the end)

When I hand-edited the inferno eaf file I used hyphens. Shall we allow that, and then transform all hyphens in morpheme and morphemeGloss tiers into –? I will add an (initially failing) test to test_MorphemeGloss.py to capture the colon -> hyphen bug. Then fix.

3) Something happened to the “def” in the gloss of “la”

I'll look into that.

4) We should add single quotes around the glosses if the authors haven’t provided them in the ELAN file.

Should be easy to do this.
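
For point 1, a self-contained page could be assembled roughly as follows. This is only a sketch: `build_standalone_page` and `image_data_uri` are hypothetical helper names, not functions from slexil, and the CSS, JavaScript and icon bytes shown are placeholders.

```python
import base64


def image_data_uri(image_bytes, mime_type="image/png"):
    """Encode raw image bytes as a data: URI so no separate file is needed."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_standalone_page(body_html, css_text, js_text, title="text"):
    """Return one HTML document with the stylesheet and script inlined."""
    return (
        "<!DOCTYPE html>\n<html>\n<head>\n"
        '<meta charset="utf-8">\n'
        f"<title>{title}</title>\n"
        f"<style>\n{css_text}\n</style>\n"
        f"<script>\n{js_text}\n</script>\n"
        "</head>\n"
        f"<body>\n{body_html}\n</body>\n</html>\n"
    )


# Placeholder assets, for illustration only.
icon_uri = image_data_uri(b"abc")
page = build_standalone_page(
    body_html=f'<img alt="play" src="{icon_uri}">',
    css_text="td { text-align: left; }",
    js_text="function playAudio(id) { document.getElementById(id).play(); }",
)
```

With everything inlined, the only external files left in the download are the sound files themselves, and users can still edit the `<style>` block by hand.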

davidjamesbeck commented 5 years ago

1) Good idea, the download should be as self-contained as possible, and the user needs to be able to customize it themselves.

2) Yes, we should transform all hyphens in both lines of the glosses. Something that goes with the colon problem: I notice that a hyphen was inserted after the nouns to introduce the gender markers (forest-fem), but there shouldn't have been a hyphen there in the input (since the word above isn't divided selv-a, which you can do in Italian but not, say, in French or German). This "fem" really should be subscripted to the word. In the old application, I just ignored things like that and had authors do it by hand because the input from ELAN was very idiosyncratic (I don't think .eaf supports that level of formatting). If we just put it in small caps, we will have to leave it to users to tweak manually in the HTML.

paul-shannon commented 5 years ago

@davidjamesbeck Can I Tom Sawyer you into fixing these things?

the grammatical terms I used in the inferno demo
rendering all of them properly

This may be more approachable than you think. Only two files are involved:

slexil/MorphemeGloss.py
slexil/tests/test_MorphemeGloss.py

I strongly and enthusiastically recommend a test-driven development strategy. The idea is that, when you encounter a bug (as with the need to transform all hyphens into en dashes), you

write a test that demonstrates the bug, that is, a test which fails
fix the code so that the test passes
then run ALL the tests to make sure that, in fixing the bug, you did not break something else
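
The three steps above might look like this for the hyphen and colon issues. This is a minimal sketch: `normalize_gloss` is a hypothetical stand-in for whatever method in MorphemeGloss.py does the transformation, and the gloss strings are made up for illustration.

```python
import unittest

EN_DASH = "\u2013"


def normalize_gloss(gloss):
    """Hypothetical helper: turn hyphens into en dashes, leave colons alone."""
    return gloss.replace("-", EN_DASH)


class TestGlossPunctuation(unittest.TestCase):
    # Step 1: write these first and watch them fail against the buggy code.
    def test_hyphens_become_en_dashes(self):
        self.assertEqual(normalize_gloss("walk-pst"), "walk" + EN_DASH + "pst")

    def test_colons_are_preserved(self):
        # Captures the colon -> hyphen bug: a colon must survive untouched.
        self.assertEqual(normalize_gloss("def:fem"), "def:fem")
```

Step 2 is to change the code until both tests pass. Step 3 is to run the whole suite, for example with `python -m unittest discover` from the tests directory, so that a fix in one place cannot silently break another.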

I'd be ever so grateful if you could take on this task! If you can, that will free me up to get the downloads capability working - for the inferno demo data, and for the webpage.zip we create.

davidjamesbeck commented 5 years ago

Hi, Paul

I will make a start on this. By way of learning git, I made some local changes and committed them to the repo: I added a reference to the IJAL styleguide for glossing to MorphemeGloss.py (and corrected a typo or two) and uploaded a file “abbreviations.txt”. It is in the slexil directory for the moment, but probably should eventually live in the demo directory, no? All this seems to have been successful.

I will try to work on the rendering. It may come in dribs and drabs, but in principle doesn’t sound terribly challenging (famous last words),

David

paul-shannon commented 5 years ago

we should add single quotes around the glosses if the authors have not provided them

Hi @davidjamesbeck, I just did a fresh pull and ran the tests. test_Ijalline.py fails at line 85:

This line from the eaf

 <ANNOTATION_VALUE>‘[a] child, a woman as well.'</ANNOTATION_VALUE>

results in this, as seen in the debugger

(Pdb) x3.getTranslation()
"‘[a] child, a woman as well.'’"

The problem seems to hinge on the difference between a ‘ and a ' - that is, two kinds of single quotes. Does the test also fail for you?
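
The doubled quote can be reproduced with a small sketch. The function below is a hypothetical stand-in, not the actual slexil method; it assumes the quoting logic adds each curly quote only when that exact character is missing.

```python
LEFT_QUOTE = "\u2018"   # ‘
RIGHT_QUOTE = "\u2019"  # ’


def add_quotes_naive(text):
    """Hypothetical stand-in: add each curly quote only if it is missing."""
    if not text.startswith(LEFT_QUOTE):
        text = LEFT_QUOTE + text
    if not text.endswith(RIGHT_QUOTE):
        text = text + RIGHT_QUOTE
    return text


# The annotation opens with a curly quote but closes with a straight one.
raw = "\u2018[a] child, a woman as well.'"
result = add_quotes_naive(raw)
print(result)
```

Because the straight apostrophe is not the curly closing quote, the check at the end of the string does not recognize it and a second closing quote is appended, which matches the value seen in the debugger.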

davidjamesbeck commented 5 years ago

Yes, but I put this problem down to author error—you shouldn’t use straight apostrophes for punctuation since they are a common symbol (for ʔ) in practical orthographies. I didn’t want to correct for this in the method for adding quotes because of the risk of removing something that was intended as phonological notation. That was just Lokono, I didn’t notice problems with the other texts, did you?

Is there a good Pandas tutorial anywhere? I found a few online but didn’t find them especially helpful.

David

paul-shannon commented 5 years ago

Hi David,

Got it. How about we test for bad input, and devise a way to handle it in a way users will appreciate? Then detecting illegal input, as with the Lokono text, becomes part of our test suite. We want all the tests to pass.

webapp.py now validates against the schema, and tries to handle failure with a little bit of grace. If the user selects a .wav file rather than .eaf, then we can tell them. Further validation, detecting illegal characters or a malformed tierGuide, should be done also, and a warning given (rather than a crashed webapp.py!)

Thinking about this just now, I realize that schema validation, and any other validation we add, should be moved to the constructor (“__init__”) of the Text class, in text.py.
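
A rough sketch of constructor-level validation follows. It is an illustration under assumptions, not the slexil code: `ValidatedText` and `EafValidationError` are hypothetical names, the constructor takes the XML as a string for simplicity, and a well-formedness and root-element check stands in for real schema validation.

```python
import warnings
import xml.etree.ElementTree as ET


class EafValidationError(ValueError):
    """Raised when the input cannot be treated as an EAF document."""


class ValidatedText:
    def __init__(self, filename, xml_text, tier_guide):
        # Fatal problems: wrong file type, or XML that cannot be parsed.
        if not filename.lower().endswith(".eaf"):
            raise EafValidationError(f"{filename}: expected an .eaf file")
        try:
            root = ET.fromstring(xml_text)
        except ET.ParseError as err:
            raise EafValidationError(f"{filename}: not well-formed XML") from err
        if root.tag != "ANNOTATION_DOCUMENT":
            raise EafValidationError(f"{filename}: unexpected root <{root.tag}>")

        # Non-fatal problems: warn and carry on so the user can fix the input.
        tier_ids = {tier.get("TIER_ID") for tier in root.iter("TIER")}
        for role, tier_id in tier_guide.items():
            if tier_id is not None and tier_id not in tier_ids:
                warnings.warn(f"tierGuide role {role!r} names unknown tier {tier_id!r}")

        self.root = root
        self.tier_guide = tier_guide
```

Keeping fatal errors as exceptions and everything else as warnings lets webapp.py catch one exception type and report it, while the tests can assert on both.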

Slexil uses pandas only for the DataFrame data structure it provides. It’s a dreadful, complicated hack, inspired by R’s data.frame but awkward to use. I keep a list of “pandas dataframe tips” in my log file, for reference: I never remember the details from one session to the next.

Here are some tutorials:

https://towardsdatascience.com/pandas-dataframe-a-lightweight-intro-680e3a212b96
https://www.tutorialspoint.com/python_pandas/python_pandas_dataframe.htm
https://mode.com/python-tutorial/pandas-dataframe/

Paul

davidjamesbeck commented 5 years ago

Hi, Paul

We can incrementally add tests for certain kinds of bad input, though the possibilities are endless. I too am all for not making anything like the straight apostrophe a fatal error, since the workflow will almost certainly involve checking the HTML output, tweaking the EAF, and generating a new page. But warnings will give users a heads-up about things to tweak.

So you wouldn’t recommend Pandas as a general way of dealing with table-like data? If not, I won’t bother (especially if you aren’t bothered by the try/except techniques). But I do use tables for other projects, so if Pandas offers advantages I might be interested in learning about it. Though my first impression is that it is kind of awkward and not that transparent.

David

paul-shannon commented 5 years ago

In Python, alas, the pandas DataFrame is the best we can do.

I am a little worried about the try/except approach to parsing. It opens the door to “Ptolemaic programming” where we add special case code to accommodate edge cases. Can we handle the 2, 5 and 6-line texts through the tierGuide?

davidjamesbeck commented 5 years ago

There are two places where I used try/except. One is in ijalLine.py.parse(), where the method needs to determine whether there is a second translation (translation2) or a second transcription (transcription2) category and, if so, needs to set the values for self.transcription2 and self.translation2. I didn’t know how to ask if those categories existed in self.tbl without throwing an exception. I’m sure you know how to fix that.
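
One way to ask a DataFrame whether a category is present, without try/except, is to filter with a boolean mask and check whether anything came back. The sketch below assumes a layout with hypothetical "category" and "text" columns; the real structure of self.tbl is not shown in this thread.

```python
import pandas as pd

# Hypothetical table: one row per tier, with its category and text.
tbl = pd.DataFrame({
    "category": ["speech", "translation", "translation2"],
    "text": ["spoken line", "first translation", "second translation"],
})


def first_text_for(table, category):
    """Return the text of the first row in this category, or None if absent."""
    rows = table[table["category"] == category]
    if rows.empty:
        return None
    return rows["text"].iloc[0]


translation2 = first_text_for(tbl, "translation2")
transcription2 = first_text_for(tbl, "transcription2")
```

Filtering on a value that is not in the column simply yields an empty frame, so the missing category is handled as ordinary data rather than as an exception.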

The second place is text.py.getTierSummary(). Here, having a None value for the morpheme and morphemeGloss categories in the TierGuide threw an exception (that is, for the 2 line model—test_text_Chatino_2Line.py). That is probably solved by abandoning the requirement that the TierGuide always have those two categories. And, in fact, the need for them might just be because of the assertions in the test file (so might not be a problem with webapp.py). That was the first test I played with, before I really got the hang of the tests and I may not have clued in to what was going on. So, this too is probably a trivial fix.
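
If None entries stay in the TierGuide, the summary can skip them explicitly. This is a sketch under assumptions: `summarize_tiers` is a hypothetical function, the guide is modeled as a plain dict from role to tier name, and the tier names and counts are invented.

```python
def summarize_tiers(tier_guide, annotation_counts):
    """Return (role, tier_id, count) rows, skipping roles mapped to None."""
    rows = []
    for role, tier_id in tier_guide.items():
        if tier_id is None:
            # The text has no tier for this role (e.g. a two-line text).
            continue
        rows.append((role, tier_id, annotation_counts.get(tier_id, 0)))
    return rows


# Hypothetical guide for a two-line text: no morpheme tiers at all.
two_line_guide = {
    "speech": "utterance",
    "translation": "english",
    "morpheme": None,
    "morphemeGloss": None,
}
counts = {"utterance": 12, "english": 12}
summary = summarize_tiers(two_line_guide, counts)
```

The same function works unchanged if the two None entries are simply left out of the guide, which is the alternative suggested above.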

David

paul-shannon commented 5 years ago

Can we make the tierGuide the single place where per-text structure variety is captured? That was what I have been trying to do.
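
As a sketch of what that could mean in code, the line object below is built entirely from the guide, so two-line and longer texts go through the same path. `GuidedLine` and the tier names are hypothetical, and the guide is again modeled as a plain dict.

```python
class GuidedLine:
    """One line of text whose available roles are decided by the tierGuide."""

    def __init__(self, tier_guide, annotations):
        # Keep only roles the guide maps to a tier that actually has content.
        self._values = {
            role: annotations[tier_id]
            for role, tier_id in tier_guide.items()
            if tier_id is not None and tier_id in annotations
        }

    def has(self, role):
        return role in self._values

    def get(self, role, default=None):
        return self._values.get(role, default)


annotations = {"utterance": "spoken line", "english": "a translation"}
two_line = GuidedLine({"speech": "utterance", "translation": "english"}, annotations)
with_extra = GuidedLine(
    {"speech": "utterance", "translation": "english", "translation2": "spanish"},
    annotations,
)
```

Callers ask `line.has("translation2")` instead of probing the table and catching an exception, so no per-text special cases are needed in the parsing code.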

davidjamesbeck commented 5 years ago

That would work, though we will still need ijalLine.py to query the TierGuide (or the data frame built from TierGuide) as to what is and is not in the structure. So problem #1 is what it is and has to be fixed sooner or later.

I’m not sure about problem #2 (what happens with the null tiers). If we allowed for a TierGuide with only 3 lines (speech and translation, plus morphemePacking), then null tiers are a non-issue for the getTierSummary() method. Let me check to see why the null tiers are needed in the test files.

David
