PhonologicalCorpusTools / SLPAA

5 stars 0 forks source link

234 write script to convert corpora saved in pre oct2023 format to new locationmovement tree format #268

Closed gracemyz closed 7 months ago

gracemyz commented 7 months ago

After this merge, coders will be able to directly switch to the main branch, and their selections should appear. They don't need to do anything different.

Added a few logging warnings:

I went through all of the corpora that were in the Google drive as of 2024-01-08 and there should be no more backward compatibility issues among them after this merge EXCEPT for one instance discussed in issue 195. However, there were several blank modules (will email Kathleen with a list just in case they weren't supposed to be blank)

kchall commented 7 months ago

Thanks, @gracemyz ! This is mostly looking good. Two issues:

  1. I'm a bit worried that there might still be some missing information. E.g. in the corpus 2023_03_14_ABBREVIATE_ACCIDENT_v3_AA_KCH_YG_KCH.slpaa from "checked files + checked contact," the sign ACCEPT has a Relation module. If I look at this module in the "frozen" October branch, I can see that there are specified handparts (specifically, fingers / tips for H1). But if I look at this in the 234 branch, this information is missing, and no warning is thrown.

(More generally @kvesik: I'm a bit worried about the saving features in the software. As I went through and checked several corpora for this review, elements that I had worked on previously and thought I saved seemed to be missing, even when other changes to the same corpus were present. We need to do some kind of proper audit. The above example is just one that is actually observably different between the frozen branch and the 234 branch.)

  1. It's great to have the warning messages, because no modules should actually be blank. But, would it be possible to specify in the warning messages (1) the type of a module and (2) the specific module number? e.g. for the error message for ANALYZE above, could it read:

WARNING:root:ANALYZE: Location Module has no selections. Is something missing?

or, even better:

WARNING:root:ANALYZE: Loc2 Module has no selections. Is something missing?

? That would help us actually manually fix the issues instead of having to open up each module to check if it's blank.

gracemyz commented 7 months ago

@kchall

  1. Sorry, I only checked the movement and location modules! My bad, I'll go back and handle the relation modules. Were the missing elements that you encountered all in the relation module, or were there others as well?
  2. Of course, I'll update to indicate the module type and module number
kchall commented 7 months ago

Thanks, @gracemyz ! I only noticed issues in relation modules, so hopefully that's okay. (But some of them were things where even when I opened them in the frozen branch, things were missing that I'm 95% sure I changed. So I think we need to do a larger audit regardless.)

Thanks so much for the addition of type and number!

kvesik commented 7 months ago

@kchall thanks for noting your concerns around saving! I will start a new issue relating to persistence of data entered / changed / converted.

gracemyz commented 7 months ago

Warning messages for Relation modules with no body part selections After today's meeting, we decided:

There may be cases where the Relation module uses both articulators, but only one of them is missing body part selections. For example, in the 2023-03-14 Abbreviate_Accident corpus, the body part specification panel for the sign ACCEPT looks like this: image In this case, I'll still output a warning message, even though the module is only half blank.

@kchall Let me know if the new empty Relation module warnings look okay, and if you're still seeing any missing information!

kchall commented 7 months ago

@gracemyz Thanks, Grace! This all basically looks great, and the issue with ACCEPT above does seem to be fixed (the information is now showing up in the module). But, I think the system might be over-warning. E.g. in 2023_04_04_ADD_v3_AFTER_v4_AA_KCH.slpaa (in the 'checked files' set), I get a warning:

WARNING:root: ADD (v.3) relation1: Module has no selections. Is something missing?

but there are selections:

image

...and the same is true for ADMIRE (v. 2) and ADVERTISE, in that same corpus -- all relation modules are throwing 'no selection' warnings, but they do all have selections.

I tried another corpus, 2023_04_19_ALCOHOL_ANGEL_AA_KCH_YG_KCH.slpaa (in the "checked files + checked contact" set), and get a similar issue for ALCOHOL Relation 1. But, Relation 2 for that same sign doesn't throw a warning. I'm wondering if the warning system is just checking whether something under 'Contact' has been selected, and is throwing the warning if that component is blank? It really should only throw the warning if the whole module is empty.

Edit: Oh, wait, no, that's not the right explanation. E.g. ACCEPT in 2023_03_14_ABBREVIATE_ACCIDENT_v3_AA_KCH_YG_KCH.slpaa (from "checked files + checked contact") is also throwing the 'no selection' warning, even though it has contact coded and in fact all its information is showing up:

image

Edit 2: I figured it out, ha ha! Re-reading your comment above, I realized that you're outputting a warning any time one (or both) of the body part selections is blank. This is not quite the desired behaviour, but I do see why you programmed it this way based on what I told you. Would it be possible to differentiate between e.g. the module "Relation1" as a whole and the "Specify handparts" dialogue that is embedded within it? So e.g. ADD v. 3 would say:

WARNING:root: ADD (v.3) relation1: Module has no handpart selections. Is something missing?

...rather than just throwing the error for the whole module?

gracemyz commented 7 months ago

@kchall I updated the Relation module, but I put "bodypart" instead of "handpart" since the specification panel allows for arm and leg parts as well.

Please let me know what I should change to get the desired behaviour šŸ˜… -- you mentioned that warnings should only be thrown if the whole module is empty, but for the Relation module, I'm currently only checking the body part specification panels. I'm not checking Contact, Direction of Relation, or Distance between X and Y.

On a side note - there are a couple other issues related to this one that I think can be closed once this is merged. Let me know if these look okay to close:

kchall commented 7 months ago

@gracemyz Thanks so much! I think the ideal behaviour would be:

  1. check to see whether the main Relation module has any specifications or is completely empty. Throw a warning if there's nothing visible on that page (no X, no Y, no Contact, no Direction of Relation, and no Distance between X and Y). That warning can say something like

WARNING:root: ADD (v.3) relation1: Main module has no selections. Is something missing?

  1. also check the body parts as you're doing, and then say something like

WARNING:root: ADD (v.3) relation1: Module has no bodypart selections. Is something missing?

If both happen to apply, then both warnings can be thrown.

Thank you!

gracemyz commented 7 months ago

@kchall I've updated with the main module check (warn if no X, no Y, no Contact, no Direction of Relation, and no Distance between X and Y). But, the Relation module already prevents the user from saving if there are missing X or Y selections - do you still want the main module warning message to check for that? The warning message will probably never appear.

kchall commented 7 months ago

@gracemyz Oh, good catch. It's okay then for it to just check the ones you have and throw a warning if all of those are empty. Thank you!

kvesik commented 7 months ago

I just ran the corpus update script on our friend ABBREVIATE_ACCIDENTv3 and got this:

image

iirc (and glancing at the code) there should be a detailed list of nodes/paths that have been revised, right?

gracemyz commented 7 months ago

The update_corpus script is outdated - I was planning to delete it before merging. Running main.py will revise all the paths.

On Mon, Jan 22, 2024, 8:18 a.m. kvesik @.***> wrote:

I just ran the corpus update script on our friend ABBREVIATE_ACCIDENTv3 and got this: image.png (view on web) https://github.com/PhonologicalCorpusTools/SLPAA/assets/33950744/8f12f063-63f3-4385-95cd-f494276e54cd

iirc (and glancing at the code) there should be a detailed list of nodes/paths that have been revised, right?

ā€” Reply to this email directly, view it on GitHub https://github.com/PhonologicalCorpusTools/SLPAA/pull/268#issuecomment-1904346218, or unsubscribe https://github.com/notifications/unsubscribe-auth/ANEPQIETP5IIKDRVLK4PIB3YP2GNLAVCNFSM6AAAAABBXWRUHKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSMBUGM2DMMRRHA . You are receiving this because you were mentioned.Message ID: @.***>

kvesik commented 7 months ago

Oh right, thank you, yes. Forgot we discussed that! Thanks @gracemyz :)

kvesik commented 7 months ago

On a side note - there are a couple other issues related to this one that I think can be closed once this is merged. Let me know if these look okay to close:

155, #190, and #193 all look good to me on this PR.

However, for #162, is this behaviour (below) expected?

  1. Create a corpus & sign (or load existing) in the oct2023 frozen branch.
  2. Open or create a location module and set the location to be "Arm (contralateral)".
  3. Save the corpus.
  4. Reopen in branch 234 --> terminal shows error output as below.

C:\Users\kvesik\AppData\Local\Programs\Python\Python311\python.exe C:\Users\kvesik\Documents\School\RA-KCH\GitHub\SLPAA\src\main\python\main.py Traceback (most recent call last): File "C:\Users\kvesik\Documents\School\RA-KCH\GitHub\SLPAA\src\main\python\main.py", line 8, in exit_code = appctxt.run() ^^^^^^^^^^^^^ File "C:\Users\kvesik\Documents\School\RA-KCH\GitHub\SLPAA\src\main\python\gui\app.py", line 14, in run self.main_window.show() ^^^^^^^^^^^^^^^^ File "C:\Users\kvesik\Documents\School\RA-KCH\GitHub\SLPAA\src\main\python\gui\app.py", line 19, in main_window return MainWindow(self) ^^^^^^^^^^^^^^^^ File "C:\Users\kvesik\Documents\School\RA-KCH\GitHub\SLPAA\src\main\python\gui\main_window.py", line 342, in init self.open_initialization_window() File "C:\Users\kvesik\Documents\School\RA-KCH\GitHub\SLPAA\src\main\python\gui\main_window.py", line 566, in open_initializationwindow response = initialization.exec() ^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\kvesik\Documents\School\RA-KCH\GitHub\SLPAA\src\main\python\gui\initialization_dialog.py", line 40, in load_corpus_button.clicked.connect(lambda clicked: self.load_corpus(load_func, clicked)) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\kvesik\Documents\School\RA-KCH\GitHub\SLPAA\src\main\python\gui\initialization_dialog.py", line 69, in load_corpus response = load_func(clicked) ^^^^^^^^^^^^^^^^^^ File "C:\Users\kvesik\Documents\School\RA-KCH\GitHub\SLPAA\src\main\python\gui\decorator.py", line 15, in wrapper_check_unsaved_change return func(self, event, *args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\kvesik\Documents\School\RA-KCH\GitHub\SLPAA\src\main\python\gui\main_window.py", line 907, in on_action_load_corpus self.corpus = self.load_corpus_binary(file_name) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\kvesik\Documents\School\RA-KCH\GitHub\SLPAA\src\main\python\gui\main_window.py", line 868, in load_corpus_binary corpus = Corpus(serializedcorpus=renamed_load(f)) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\kvesik\Documents\School\RA-KCH\GitHub\SLPAA\src\main\python\lexicon\lexicon_classes.py", line 470, in init self.add_missing_paths() # Another backwards compatibility function for movement and location ^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\kvesik\Documents\School\RA-KCH\GitHub\SLPAA\src\main\python\lexicon\lexicon_classes.py", line 573, in add_missing_paths self.add_missing_paths_helper(gloss, module.locationtreemodel, type, count, correctionsdict) File "C:\Users\kvesik\Documents\School\RA-KCH\GitHub\SLPAA\src\main\python\lexicon\lexicon_classes.py", line 610, in add_missing_paths_helper paths_to_add = self.get_paths_to_add(oldpath, type) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\kvesik\Documents\School\RA-KCH\GitHub\SLPAA\src\main\python\lexicon\lexicon_classes.py", line 720, in get_paths_to_add nodes[1] == 'Arm - contra'


IndexError: list index out of range
gracemyz commented 7 months ago

Thanks @kvesik for the catch - fixed !

kchall commented 7 months ago

@gracemyz I think this all looks good to go -- yay! After it's pushed, I will let the two coders currently working on the frozen branch that they can sync to main and tell them to look out for warning messages as they open corpora. THANK YOU!