lingdb / Sound-Comparisons

Exploring phonetic diversity across language families —
http://www.soundcomparisons.com

Culture List Malakula #446

Open AvivaShimelman opened 7 years ago

AvivaShimelman commented 7 years ago

We now have a second list of items for the Malakula recordings. The list is by design unique to Malakula, with items like 'edible flesh of a sprouted coconut', 'tickle the belly of a pig in a fashion that makes it lie down and sleep', 'hole in a woven bamboo wall through which the sun shines inside', 'spit-out fiber of sugarcane', and 'take the face of a man whom you have killed' (all elicit simple terms), along with the slightly more universal 'father-in-law', 'breadfruit' and 'wean'. List attached. It includes 300 items, although I am recommending that we use only a subset (decision yet to come from Russell on this). In any case, the work of marking it up is the same. Items are numbered, but that can be ignored. Renumbering wouldn't throw any wrenches in it for me. Elicitation is always in the order listed.

A second issue is that coverage is a bit different. We have all the major speech varieties (45), but each is represented, for now, with a single recording. In the modal case, the speaker has also made a BV (basic vocabulary) list, but that is just because I preferred to work with "known entities" in recording this one. I privileged those speakers from the first round who offered the clearest "central representations" of languages, and not, as I had done generally with the BV recordings, the oldest speakers of marginal varieties. So tagging the new items on to existing recordings is not really an option. Either the first set of items will come up empty with the culture list or the second will come up empty for the BV list. Maybe a second map?

Malakula "culture list" items 24 02 2017.xlsx

AvivaShimelman commented 7 years ago

I forgot to mention - I have photos and short (5-20-second) videos for many items. Video shot in gorgeous, memory-gobbling 4K. A.

PaulHeggarty commented 7 years ago

Having spoken to Russell about this yesterday, he is happy for me to make some decisions on the practicalities of implementing this into SoundComparisons.

PaulHeggarty commented 7 years ago

Timetable for the Cultural list project.

AvivaShimelman commented 7 years ago

Sounds good to me. I'll make a "T/F" list for the items that I have photos and videos for. Short answer: I have photos for something like 120 items. If you want better coverage and uniformity is not an issue, I can do some google image searches.

A.

On 3/1/17, Paul Heggarty notifications@github.com wrote:

Having spoken to Russell about this yesterday, he is happy for me to make some decisions on the practicalities of implementing this into SoundComparisons.

  • I can confirm that the culture list will not be merged into the same site as the BV list. It will be a separate section of the site, initially I presume just a separate entry in the drop-down studies list (the best way to do this longer-term is to be determined with @Bibiko).
  • The Culture list section will have its own 'word' list (meanings rather than cognates) preferably, not a new language list, but just a sub-set of the same language varieties used for the BV list.
  • For me, it's no problem if the speakers are different for the same language variety, because there will be no page on which both the Basic and Cultural lists are mixed up together. Far more important is not to start getting conflicts and confusions between language varieties for the different lists, and not to add new language varieties.
  • What we could do, where a Culture list is available, is to include, in 1 Language table view only, a button to 'show also cultural vocabulary', as a second table under the main one. This could be useful, for example, for searching for more tokens of a given sound within a single language variety.
  • To include photos (and videos?), I think the best means would be one parallel to the one that we already have for speaker photos assigned to individual languages. That is, in the word selector column on the right, a mini-thumbnail would appear whenever there is a photo for that term. Clicking on it would show the full photo.
  • To that end, @AvivaShimelman, could you let us know whether you have photos for all items on the cultural list? They will also have to be given appropriate filenames, but wait out on that until we work out the list of exact names (no special characters) that we'll need.

AvivaShimelman commented 7 years ago

All fine with me.

My part of the BV work is basically done. Transcriptions and pre-edited recordings now just await TLC from LW.

I have three more lists that I want to add to fill out the Larevat representation.

I want to do a full listen-through on the whole site, too, but that won't take more than 3 days or so on my part.

I think the work on transcribing and pre-editing the CL files for me is the same whether the site is ready to receive them or not. They can sit indefinitely. Depending on how much participation you will need from me when you do start trying to move that material onto the site, I would be willing to just do it "pro bono" or to sign some kind of quickie freelance contract.

A.

On 3/1/17, Paul Heggarty notifications@github.com wrote:

Timetable for the Cultural list project. Yes, preparation work, as far as possible, should be done while @AvivaShimelman and @LauraWae are still working on the project. However, adding a new list is not trivial at all in how much time it will demand from @PaulHeggarty (a week or so) and @Bibiko (more). Neither of us will get any such free time for many weeks, if not months. So this system will not be up and running for months. This also means that for @AvivaShimelman and @LauraWae too, more important is to finish off all work on the BV list, and any other general corrections and improvements to the site, as a higher priority than the Culture list. I presume, by the way, by this stage, that we will not be getting a Bislama translation done at any time remotely soon.

PaulHeggarty commented 7 years ago

What is TLC, please? And do you mean the transcriptions are done, and just need to be uploaded? Or that you will start on the transcriptions once you hear from Laura?

The full listen-through sounds like a very good idea. In an ideal world, you would do this in 'editor mode' https://github.com/lingdb/Sound-Comparisons/issues/425, but that won’t be working in time, I fear.

Yes, you’re right: the work on transcribing and pre-editing the CL files for me is the same whether the site is ready to receive them or not. So please go ahead, but once the other things are done. The idea of a quickie freelance contract, as and when necessary, sounds good, too.

AvivaShimelman commented 7 years ago

"TLC" = "tender loving care" (I never would have thought that one was current only on this side of the pond!) Transcriptions are done and uploaded. Laura is already chipping away at them, I believe. So, I'll understand "once other things are done" as once I've finished the listen-through, too.

A.

AvivaShimelman commented 7 years ago

@LauraWae I've created files on my OC for the CL selected recordings and transcriptions and shared them with you. I leave it to you to create (or not) folders on "Malakula" as well. Please have a look at them and let me know what questions you have. It might be easier to Skype, depending. The prompts are written in English and Bislama. In general, I went through the list with consultants before hand and transcribed their answers, so when we went to the recording, you'll hear me prompt not just the Bislama, but actually the responses, too.

A.

LauraWae commented 7 years ago

Hi @AvivaShimelman and @PaulHeggarty

I have created this Praat glossing-list for the basic vocabulary. I think it should work quite fine, and I am happy to hear feedback. Thanks!

uncle_mother_older uncel_father_older aunt_father_younger aunt_mother_younger cousin_boy_boy_younger cousin_boy_boy_older cousin_girl_boy_older cousin_boy_girl_younger grandson_son granddaughter_daughter nephew_brother nephew_sister niece_brother niece_sister sister_boy_older brother_girl_older brother_boy_younger father_in_law mother_in_law daughte_in_law not_nameable boy_firstborn pregnant_bride adopted_child wife_sister_first wife_sister_all levirate_wife respect share peace discussion consensus collective_work namangi lowest_grade_name chief fight resolve fine revenge help barter gift inherit part_of_landowner slave village gate dance_ground toilet_w toilet_m ritual_house_w nakamal post horizontal_beam rope_beams natangora_bamboo sunbeam_wall hole_in_wall instrument_laplap stone_in_laplap laplap_in_bamboo leftovers green_banana_feeling soot knife_clam_shell cup_coconut cut_before_cooking elephant_taro taro_swamp wild_yam wild_pig island_cabage banana breadfruit sugar_cane sugar_cane_fiber prepare_yam_garden mound_yam yam_sprouted yam_to_harvest first_yam stake_yam wilkin garden_last_year pig_eats_garden drag_fire stick_for_garden_digging coconut_flower coconut_before_food coconut_skin navara coconut_dry coconut_leaf_midrib coconut_leaf_frond coconut_milk strong_bamboo bamboo_ring bamboo_length bamboo_stand tree_fern hibiscus natangora_bark nangalat bangan_tree broken_wave tidal_wave swell deep_sea deepest_place driftwood wind_towards_sea wind_towards_bush cyclone sun_shower earthquake moon_waning moon_waxing time_before_dawn first_month wet_season coral dead_coral reef where_water_enters_sea star_constelation_to_plant_yam place_of_thick_vines indicate_trail_on_tree horizon fish_scale flyingfox_black flyingfox_white spider_web turtle_shell hawk owl maggot butterfly octopus fire_ant shark dolphin firefly gecko pig_teeth_twice pig_tooth_normal pig_castrated sow tickle_pig_belly hunt_birds_at_night sling arrow arrow_three_points arrow_miss_target bow bostring 
catch_shrimp_with_hand gather_seafood bambbo_trap axe_handle girl_growing_breasts girl_falling_breasts girl_to_be_married boy_before_nakamal circumcision_ceremony circumsize boy_circumcized shave mourning_non_core mourning_core dead_ceremony dead_ceremony_100_days land_of_the_dead wail bed_for_dead_body bury_dead unnatural_death cly_effigy grass_skirt nambas belt_made_of_bark belt nose_ring armband tattoo make_cord_by_rolling_fibers pay_bride_price reserve_a_bride pullout_teeth_of_bride dance_around_tamtam womens_dance ankelt_for_dancing stick_beating_tamtam slit_a_slit_gong handdrum conch panpipe mask tall_mask lisepsep stone_dancingground line_of_stones stone_with_face holy_place weave mat mat_for_baby basket weave_bambu finish_mat join_two_mats dye_leaves print_mat mat_from_coconut leaves_from_coconut_mat sanddrawing erase_sanddrawin cats_cradle hand_pinching_bird_naming paddle figure_head outrigger wood_connecting_canoe_and_outrigger canoe_hollow sail God deads_person_spirit spirit devil Creator crazy nightmare happy lazy skull nipple thumb lond_head defecate disembowel albino medicine poison doctor black_magic take_the_face_of_another_peson sore_on_the_bottom pick_lice crush_lice elephantitis midwife abortion adultress infertile womb menstruation pregnant give_birth wean love teach scold bounce_a_baby war warrior deadly_part_of_head carry_on_back carry_between_two_people show_teeth sit_to_keep_warm sit_legs_outstretched reach_hill walking_stick walk_about_at_night hit_shell taboos_cross shout_out_in_the_bush wooden_headrest day_after_tomorrow posion_fish_in_pool grate forget rollers_for_canoe bambu_to_knock its_ok sorry kava_tray kava broom pantanas lie cluck_chicks pick_all_fruit limp

AvivaShimelman commented 7 years ago

Halo!

Thanks for being on this. Whether or not it will work depends on whether it is an internal document (i.e., just for us, so that we have constant column labels for mapping audio files to transcriptions) or a public one (i.e., visitors to the site are going to see these).

A.

LauraWae commented 7 years ago

Hi, Thanks for replying so soon. This is only internal, for tagging in Praat (remember the words we filled in between boundaries? They can't have special characters and should be as simple and as specific as possible at the same time). All the best, Laura
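The constraint on the Praat labels described above can be checked mechanically. This is only a minimal sketch, assuming that "no special characters" means ASCII letters, digits and underscores, and that each label should occur only once in the list; the function name `check_labels` is hypothetical and not part of any Sound Comparisons tooling.

```python
import re
from collections import Counter

# Assumption: a valid label uses only ASCII letters, digits and underscores.
LABEL_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def check_labels(labels):
    """Return (invalid, duplicated) labels found in a list of glossing labels."""
    # Labels containing anything outside the allowed character set.
    invalid = [label for label in labels if not LABEL_PATTERN.fullmatch(label)]
    # Labels that occur more than once, which would make tagging ambiguous.
    counts = Counter(labels)
    duplicated = sorted(label for label, n in counts.items() if n > 1)
    return invalid, duplicated


# A few labels taken from the list above, split on whitespace as posted.
sample = "father_in_law breadfruit wean its_ok".split()
invalid, duplicated = check_labels(sample)
```

Running the full list through a check like this before tagging would flag any label that breaks the naming rule or appears twice.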

AvivaShimelman commented 7 years ago

It'll be fine.

A

AvivaShimelman commented 7 years ago

@PaulHeggarty How do you want to deal with the metadata for the CL recordings? Do you want me to make a new "For pasting" sheet based on the one we currently use for the BV lists?

A.

AvivaShimelman commented 7 years ago

@LauraWae Please give me a heads-up when you want to start in on the CLs. I'm kind of piling them up here until we figure out exactly how we want to organize ourselves.

A.

PaulHeggarty commented 7 years ago

@AvivaShimelman Aren’t the languages for the culture list just a subset of those we already have for the basic vocabulary list? We only need additions to the for pasting sheet if you want to add wholly new languages.

PaulHeggarty commented 7 years ago

As for the timing on the CLs, bear in mind that I have no time before I start my parental leave to prepare the new culture list for the website. So all that Laura can do until the end of her contract at the end of April is to mark up the lists. There'll be no sound file extraction or uploading for now, until I can get back to this in the autumn, and until we have a new Sound Comparisons administrator to take over from Laura. We just need to leave the recordings and textgrids in the cleanest possible state so that the extraction and uploading can be completed later by somebody new to the process.

AvivaShimelman commented 7 years ago

Yes, I think that was the way we left it, which works for me.

AvivaShimelman commented 7 years ago

Right. We have CLs for all and only those languages for which we have BV lists. Still, I'll make up some kind of informal reference sheet for working purposes, so that Laura -- or whoever -- can match file names to languages (transparent, in any case) and I can link photos or any auxiliary material.

A.

PaulHeggarty commented 7 years ago

Better would be already to change all the file names to the SndComp filenames. (But just in case, first make a backup copy of the file with its original name.) Then there is no more renaming to be done by anyone, especially people new to the project who might well be lost in all the filenames!
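The backup-then-rename step could be scripted along these lines. This is a rough sketch only: the SndComp filename format is not spelled out in this thread, so the new name is passed in by the caller, and the function name `backup_and_rename` and the backup folder are hypothetical.

```python
import shutil
from pathlib import Path


def backup_and_rename(original, new_name, backup_dir):
    """Copy `original` into `backup_dir` under its old name, then rename it."""
    original = Path(original)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    # Keep a copy under the original name before touching anything.
    backup = backup_dir / original.name
    if backup.exists():
        raise FileExistsError(f"backup already exists: {backup}")
    shutil.copy2(original, backup)

    # Rename in place; refuse to overwrite an existing file.
    target = original.with_name(new_name)
    if target.exists():
        raise FileExistsError(f"target already exists: {target}")
    original.rename(target)
    return target
```

Refusing to overwrite either the backup or the target keeps a second run from silently destroying an earlier copy.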

PaulHeggarty commented 7 years ago

We still need some clarifications from @AvivaShimelman on points where we cannot clearly understand what is meant in some of your notes:

"We have CLs for all and only those languages for which we have BV lists."

Do you mean 'languages' in the sense you often use it in, i.e. not as any language variety (including your 'dialect')? Please put numbers on this, so it's clearer. Am I correct in paraphrasing this as:

We do not have CLs for all of the full 135 varieties for which we have BV lists. Rather, we only have CLs for about 40 of those 135 varieties, only for one variety ('dialect') of each of the 40 fully-fledged languages. In other words, we have a CL only for one representative 'dialect' (level 6) of each 'language' (level 5).

PaulHeggarty commented 7 years ago

Trying to understand what you want us to do with the Tape: Tautu and Tape: Tautu 2 recordings.

Should we: (a) Keep these as two separate varieties on the database, distinguished only as 1 and 2. (b) Merge these recordings into a single variety.

Since (if we understand correctly) these are the same speaker (albeit the second time with two others also commenting), I can’t see why we should do (a), but from how you phrased it ("supplement") it is not clear whether you mean (a) or (b).

PaulHeggarty commented 7 years ago

Please also advise us on what we are to do with the other cases where we have a 1 and a 2 recording and the rest of the variety name is identical.

LauraWae commented 7 years ago

Hi also from my part :) @AvivaShimelman This is my heads-up to start on the Cultural List. Please start uploading the files to

ownCloud\SndComp\Malakula\2 - LW DA - To Mark up in PRAAT.

Thanks!

AvivaShimelman commented 7 years ago

Worry: lect/speaker distinction will be lost. Any future analysis will have to consider the possibility that many lects -- the "depopulated" one in particular -- are aggregates in some sense of an albeit constrained range of idiolects. In particular, the seeming free alternation between voiced and unvoiced pairs and between adjacent places of articulation is going to be interesting/challenging. Again, you know your system, analytical requirements, and so on. I'll just change file names and make copies.

A.

On Tue, Apr 11, 2017 at 2:40 AM, Paul Heggarty notifications@github.com wrote:

Better would be already to change all the file names to the SndComp filenames. (But just in case, first make a backup copy of the file with its original name.) Then there is no more renaming to be done by anyone, especially people new to the project who might well be lost in all the filenames!

PaulHeggarty commented 7 years ago

Worry: lect/speaker distinction will be lost.

When? By doing what?

PaulHeggarty commented 7 years ago

I don’t think idiolects are a reasonable, practical objective for Sound Comparisons, though.

AvivaShimelman commented 7 years ago

You understood correctly. Level 5. We have one example for every level 5 lect agglomeration. Potentially two for Nah/x'ai.

A.

On Tue, Apr 11, 2017 at 8:23 AM, Paul Heggarty notifications@github.com wrote:

We still need some clarifications from @AvivaShimelman on points where we cannot clearly understand what is meant in some of your notes:

"We have CLs for all and only those languages for which we have BV lists."

Do you mean 'languages' in the sense you often use it in, i.e. not as any language variety (including your 'dialect')? Please put numbers on this, so it's clearer. Am I correct in paraphrasing this as:

We do not have CLs for all of the full 135 varieties for which we have BV lists. Rather, we only have CLs for about 40 of those 135 varieties, only for one variety ('dialect') of each of the 40 fully-fledged languages. In other words, we have a CL only for one representative 'dialect' (level 6) of each 'language' (level 5).

AvivaShimelman commented 7 years ago

Bottom line: I want (2) treated as a revision to (1). I defer to administrative ease, though, as both strategies are defensible. A.

On Tue, Apr 11, 2017 at 8:26 AM, Paul Heggarty notifications@github.com wrote:

Trying to understand what you want us to do with the Tape: Tautu and Tape: Tautu 2 recordings.

Should we: (a) Keep these as two separate varieties on the database, distinguished only as 1 and 2. (b) Merge these recordings into a single variety.

Since (if we understand correctly) these are the same speaker (albeit the second time with two others also commenting), I can’t see why we should do (a), but from how you phrased it ("supplement") it is not clear whether you mean (a) or (b).

AvivaShimelman commented 7 years ago

I don't think I was opaque about this one. Laura got it right from the get-go. All files (and these are the ones I uploaded first, two months ago) with the same file numbers as existing recordings on the site, modified with an "R" in the number and a "revised" in the title, are meant as revisions to existing recordings where formerly inflected items have been swapped out for uninflected counterparts. The transcriptions for these (and only these) were uploaded to my transcriptions file on 16/2; they are separated from all other transcriptions in the complete list uploaded 02/3:

15-10-R Avava Tisvel Revised

15-35-R Tape_Tautu_Revised

150-052-R Pangkumu Datisman Revised

160-012Wala Worprev

160044-R Taute Alpalak Revised

160-080-R PortSandwich Marivar Revised

162-087-R Navwien Mbonvor Revised

A.

On Tue, Apr 11, 2017 at 8:27 AM, Paul Heggarty notifications@github.com wrote:

Please also advise us on what we are to do with the other cases where we have a 1 and a 2 recording and the rest of the variety name is identical.

AvivaShimelman commented 7 years ago

Yuppee! Let's start with that first one: Najit as a test case and see what problems may surface.

A.

On Tue, Apr 11, 2017 at 8:32 AM, LauraWae notifications@github.com wrote:

Hi also from my part :) @AvivaShimelman This is my heads-up to start on the Cultural List. Please start uploading the files to

ownCloud\SndComp\Malakula\2 - LW DA - To Mark up in PRAAT.

Thanks!

AvivaShimelman commented 7 years ago

I'm obviously being obtuse on this one. As I understand it, you want me to not create any new metadata entry anywhere for a CL list from the "same lect" for which we already have a BV list. So, for example, if I have a Novol-Dixon Reef BV and I made a Novol-Dixon Reef CL I just maybe tag "CL" on an otherwise identical file name. Can do. Possible problems, though: The CLs are the product of committees (if getting a unique referent for "stone" is hard, try "wail for the dead," "community work day," and "abortion," to name a few), and, of necessity, committees of people from different "dialects," so, although pronunciation will necessarily be that of a single speaker (lect), there's something a bit "higher order" going on. Then there is just the issue of pictures.

Again, sorry for being obtuse. All of this can be left with some transparent labeling and an unofficial metadata sheet until there is time to deal with it ... if there is, indeed, anything to deal with.

A.

On Tue, Apr 11, 2017 at 5:37 PM, Paul Heggarty notifications@github.com wrote:

Worry: lect/speaker distinction will be lost. When? By doing what?

PaulHeggarty commented 7 years ago

Obviously yes, you'll need to add a tag like _CL (for now), but that's not what I mean by renaming. What I mean is changing file names from your type to the Sound Comparisons type, for example .…

Laura will do all this, in any case. You can just leave your original file names as you had them, plus _CL

AvivaShimelman commented 7 years ago

@LauraWae I think you're linked into my CL folder. Everything is really just as it was for the BV lists. Right now, you'll find the Najit, which we can start with, along with the group it is a part of (Malua Bay, Espiegel's Bay, Tirax, Siviti, Batarxopu, Nese). The Tirax 162 recording does have a "sister" R recording. That, too, will work just like the BV-Rs did, with new entries trumping old ones (although I think I've removed any from the first for which there would be any overlap). I will probably yet revise the transcriptions when I do a final listen, of course. The Wowo/Alavas file will require a bit of explanation, but we'll deal with that when we come to it. That'll be the only one that's funky, I think. Some of the recordings have more bird/child noise than others. Those, of course, will have to be clipped to the very edges (not as if you didn't know). Again, in general, I've done almost as much as I'm comfortable with in terms of noise reduction, and I fear that more will distort the sound, but you, of course, can play with it and see if there is magic still to be worked. As before: no sound without transcription and no transcription without sound. If either is missing, its pair gets thrown out. Beware of gaps/jumps. Anything else, just ask. We can Skype at some point if that would be helpful.
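
For anyone scripting a check on these two rules (pairs only, and "R" entries trumping the base recording), here is a rough Python sketch. It is not part of the actual workflow: the function names and the dictionary layout (item gloss mapped to a clip name or a transcription) are assumptions made purely for illustration.

```python
def pair_entries(sounds, transcriptions):
    """Keep only items that have BOTH a sound clip and a transcription.

    `sounds` and `transcriptions` are hypothetical dicts keyed by item gloss.
    An item missing from either side (or empty on either side) is dropped.
    """
    paired = {}
    for item, clip in sounds.items():
        text = transcriptions.get(item)
        if clip and text:
            paired[item] = (clip, text)
    return paired


def apply_revision(base, revised):
    """Merge a revised ("R") recording over its base: new entries win."""
    merged = dict(base)      # copy, so the base entries are left untouched
    merged.update(revised)   # revised entries replace base entries
    return merged
```

Running `pair_entries` after `apply_revision` would apply the override first and then discard any item that still lacks its partner.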

Good luck!

A.

AvivaShimelman commented 7 years ago

@LauraWae wrt file naming: In the future, I'll label files as Paul suggests in his last post. For the ones already uploaded, I'll depend on you. Metadata sheet (a mini reference sheet) is on its way. The labeling system for the files already up is the same as it has always been for the BV lists, except for the prefix. BVs are prefixed with "wl" while CLs are prefixed with "wlc". As before, they all have unique numbers and are labeled for language and dialect (where they do indeed represent a unique dialect).
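
In case it helps with sorting the uploaded files, here is a small Python sketch of how the prefix convention could be read off a file name. The underscore-separated layout (prefix, number, label) is inferred from names quoted in this thread, such as wlc_1630064_Siviti_Gonwar_selected; the function name and the returned fields are hypothetical, not anything the project defines.

```python
import re

# Assumed layout: <prefix>_<unique number>_<language/dialect label>
# "wlc" must be tried before "wl" so the longer prefix is not cut short.
FILE_NAME = re.compile(r"^(wlc|wl)_(\d+)_(.+)$")


def parse_recording_name(name):
    """Split a recording name into list type, unique number and label."""
    match = FILE_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized file name: {name!r}")
    prefix, number, label = match.groups()
    list_type = "CL" if prefix == "wlc" else "BV"
    return {"list": list_type, "number": number, "label": label}
```

The number is kept as a string so that it can be compared directly with the index in the metadata sheet.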

AvivaShimelman commented 7 years ago

@LauraWae Attached is a spreadsheet (also uploaded to "reference") listing the CL recordings along with their original file names (mine) and their future SC file names. In those instances (all but 8) where there is a BV recording that corresponds perfectly (i.e., in both language and dialect) to the CL recording, the name is the same, with the only difference being that the CL file is appended with a "CL." The index number is also the same. In the remaining 8 cases, the recording is of the same language but not necessarily the exact-exact same dialect (or, more frequently, not necessarily of a single dialect, being the work of a committee of speakers of different dialects). In these cases, I have followed out the names and indexes as far as is possible (i.e., to level 6) and left the rest to be filled in however Paul specifies (I imagine this will just mean assigning them their own level 5 codes). As all other information -- ISO/Glottolog codes, L&L ... -- will be the same as for the corresponding BVs, I didn't repeat this. I did, however, fill in recording place and date information as well as speaker names, as these may differ. Note that some recordings are appended with an "SC". This indicates a recording that supplements another. On the list, an "SC" recording will figure immediately underneath the base recording that it amends. There are 7 of these. They were made in those cases in which later consultations revealed either that the original speakers had erred or that items about which the original speakers had been unsure were ultimately retrieved. I hope I haven't been too opaque in this.
2017 04 12 CL metadata for SC.xlsx

LauraWae commented 7 years ago

Hi, Thanks for the sheets and also for the clarifications to the gloss list from two weeks ago. I have finished the first two CL recordings now. This means that I have tagged them with Praat. You can have a look at them in

SndComp\Malakula\Culture List\02 Tagged Files and Textgrids.

You will notice that I have split the Malakula folder into two main subfolders - one for the BV, and the other for CL.

With the two files I have worked on so far, nothing out of the ordinary came up. I used your transcriptions to check the spellings. Occasionally, I deleted transcriptions where there was no sound in the pre-selected recording. That's all.

As you asked in #460, here is my answer: it takes me about 40 minutes to tag a recording.

AvivaShimelman commented 7 years ago

Wow, that really isn't a long time at all. For a couple of them, I'll put editing notes when I upload. That usually just means, like with the BVs, that the recording has more "interference" than I am happy with, so that I will have already pushed the noise reduction and the only solution is just to be sure to cut them at the very edges. A.


-- Aviva Shimelman, PhD

LauraWae commented 7 years ago

Hi on day # 3.

Two things: I have just finished wlc_1630064_Siviti_Gonwar_selected. There was a big break between "land_of_the_dead" and "doctor", where transcriptions went in italics. Do I have to take special care about something in that case?

And, secondly: It looks like the recording wlc_1620034_Tirax_02_Mae_selected does not match the transcriptions named "Tirax Lani". I am confused because of that and I wanted to know if those are the right transcriptions I am choosing or if I need to look for them somewhere else.

Thanks a lot in advance.

LauraWae commented 7 years ago

Additionally:

Could you please also indicate which of those labels

image

is

wlc_1620046-47 Siviti Batarxopu wlc_1620048 Wowo.

It's somehow not clear to me, and I would be very grateful for your assistance. Thanks.

AvivaShimelman commented 7 years ago

@LauraWae Right. Don't worry about the italicized items in the Gonwar recording for now. Tirax_Mae and Tirax_Lani are different recordings. The transcription for Tirax_Mae is labeled "Tirax 163"; the transcription for Tirax_Lani is labeled "Tirax_Lani". I apologize. That could have been clearer. Note that the supplement/revision to the Tirax Mae recording is labeled with the same file number appended with an "R" and file name appended with "Revised". A.

AvivaShimelman commented 7 years ago

@LauraWae I'll re-label the transcriptions to include the numbers (I hadn't because, in general, unlike with the BV lists, we only have one recording per language for the CL lists; we just hit the exceptions first, it seems). I'll be uploading those in a minute or two, so it will be clear which is Siviti and which is Batarxopu. My bad. Wowo is Wowo. More explanation following upload. I'm so glad we're starting in on these! Are you going to be working over the weekend? Do you want a fresh set? A.

AvivaShimelman commented 7 years ago

@LauraWae So I've re-uploaded the north-center transcription sheet to number the transcriptions. I think it worked before when we went with the rule "Always go with the numbers." So I'll keep to that system. The authority is the metadata sheet. Whatever numbers it gives are the ones we're going with. I'm now re-uploading the Batarxopu and Siviti sound files, re-labeled. Those are unique. They were actually done by the same speaker (mother spoke one language, father spoke the other; the two villages were separated by a stream). So he gives each word two times, the first time in Siviti and then the second time in Batarxopu. It should always be clear. I prompt it. He gives the Siviti. Then I say, "Narasaid" ('other side') and he gives the Batarxopu. In any case, the responses are distinct enough that you should be able to read them off the transcripts.
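
To make the Siviti-then-Batarxopu ordering concrete, here is a hedged Python sketch of how the tagged tokens from that one recording could be split into two sets. The input format, a list of (gloss, form) pairs in recording order, and the function name are assumptions for illustration only; the real tagging is done by hand in Praat.

```python
def split_two_lect_tokens(tokens):
    """Assign the first token of each gloss to Siviti, the second to Batarxopu.

    `tokens` is a hypothetical list of (gloss, form) pairs in the order
    they occur in the recording.
    """
    siviti, batarxopu = {}, {}
    for gloss, form in tokens:
        if gloss not in siviti:
            siviti[gloss] = form        # first response: Siviti
        elif gloss not in batarxopu:
            batarxopu[gloss] = form     # response after "Narasaid": Batarxopu
        else:
            raise ValueError(f"more than two tokens for {gloss!r}")
    return siviti, batarxopu
```

A gloss that appears only once ends up in the Siviti set alone, which flags it for a manual check against the transcripts.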

AvivaShimelman commented 7 years ago

@LauraWae wrt the Wowo/Alavas files. That one was a challenge because the language is really in decline. The principal speaker/first group got a bit more than half. There are three older speakers who, in different sessions, did manage to agree on about 20% more of the list. I recorded two of them. So our entire set for Wowo/Alavas is a bit "composite". It should be pretty transparent, though. There are three recordings uploaded and there are three different columns in the transcriptions. I've grouped the transcriptions in the new "color" sets. In the metadata, these are separated by spaces.

AvivaShimelman commented 7 years ago

@LauraWae I've re-labeled and re-uploaded the Wowo/Alavas sound files. I don't know if things got better or worse! I've named the files as in the metadata sheet (for example, wlc_1620050_Wowo/Alavas_Lesmarlas), but they appear on oc without the numerical prefix (example: Wowo_Lesmarlas_selected). They're still unique, so it shouldn't be hard to tell them apart (I hope). 48 is Wowo, 49 is Alavas and 50 is Lesmarlas.

LauraWae commented 7 years ago

Aviva's comments from 23rd and 24th of April in #458

@LauraWae Mornin'! I uploaded the northern set (Wowo-Alavas, V'ao). V'ao has got a lot of bird noise in the background (semi-outdoor recording at 17:00). The speaker does generally manage to beat the birds when he's talking, but the darn chirpers can be heard immediately at the edges. (Sorry!). There's a set of about 20 items that I've reserved in a separate column (it'll be obvious). There's nothing that has to be done with those right now.

A.


@LauraWae Morning! Are you going to need more recordings to play with today? I've put Avava, Fifti and Tasmbol in the common folder. Please, please, please do them in that order and take them one by one, i.e., only as you do them. That way, if there have to be changes, those can be made before anything gets into the gears. I forgot to say, with regard in particular to the V'ao recording: Sometimes consultants give two different forms. Both will appear in the transcription and on the recording. In the transcriptions, I separate alternative forms with commas. This is the only purpose for which I use commas, so a quick, old-fashioned search will pick them out if you worry you didn't catch them. I think I remember Paul saying that there is a way to register two different responses. If there isn't, I've put my first pick, not counter-intuitively, first. There are never too many of these (anywhere from 0-6 in any recording?). It can be important to record both in cases, for example, where one form is cognate with those in neighboring languages to one side and the other with neighboring languages on the other side. I'll be going to sleep soon so I'm afraid I won't be able to respond to emails until all the hustle and bustle of "Jena Tuesday" has slowed.
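
Since commas are used only to separate alternative forms, a transcription cell can be split mechanically. The Python sketch below assumes each cell is a plain string; the function names are made up for illustration, and whether the database can store more than one response per item is still the open question mentioned above.

```python
def split_alternatives(cell):
    """Return the alternative forms in a transcription cell, in listed order."""
    parts = [part.strip() for part in cell.split(",")]
    return [part for part in parts if part]   # ignore empty pieces


def preferred_form(cell):
    """Return the first-listed (preferred) form, or None for an empty cell."""
    forms = split_alternatives(cell)
    return forms[0] if forms else None
```

If only one response can be registered, `preferred_form` gives the first pick; otherwise `split_alternatives` keeps both in their original order.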

AvivaShimelman commented 7 years ago

@LauraWae Long story short: I'm adding a second Neverver recording. But -- don't be daunted! -- it's a partial file.

Right now we have the Limap dialect on our list. I was just reviewing it and I realized that I only recorded those items where Limap differed at all from the other dialect I recorded, Mindu. It was Mindu that I had recorded completely. With Limap, we had reviewed the whole set and then, when it came time to record, the man -- who was quite elderly -- just started to get very tired, so I picked out those where the two dialects differed in any regard -- maybe a quarter. That said, it's already pre-edited and transcribed, so there's no reason not to use it, I don't think.

AvivaShimelman commented 7 years ago

@LauraWae Is today really the end of your contract? Do you know when you'll have a replacement? Whenever that will be, I guess, I'll just stockpile recordings and transcriptions for them. Good luck on your next stop! I'd be glad if you dropped a line every once in a while. A.