brucknerp / jpn_gen_pop


Project Update #3 #4

Open gmd29 opened 6 years ago

gmd29 commented 6 years ago

This week we finalized our song selection. We marked up the structure of several songs and wrote a schema for them. The schema has a placeholder called "insert" for when we start marking up the morphology and syntax in the songs.

Over the next week, I (Gina) will mark up the structure of the rest of the songs, and we will discuss the characters that we will be looking for in the songs so that I know what to look for. We'll also update the schema to include these features and start marking them in our documents.

JosephDRogers23 commented 6 years ago

I'm curious to see what songs you guys have selected. Are you able to provide links to music videos? How well do you think the songs fit into your project question?

enb34 commented 6 years ago

I'm interested to see how you'll be marking up syntactic features within your text. If you're focusing on something like formality or sentence structure, which might stretch across line breaks, how will you keep the hierarchy of your code while still tagging the relevant features? As far as your schema goes, will you be including a set of characters allowed in the schema, or will you be leaving this as open text and sorting through the data afterward?

brucknerp commented 6 years ago

@JosephDRogers23 There's an Excel sheet of our song list if you're interested in checking some songs out! Japan is really strict about copyright for music videos, so it's hard to find a lot of the newer songs around (unless it's a cover). As for whether they fit, they are all very popular songs of their time, and (for what it's worth) most Japanese people I talked to knew the majority of them. And while I was transcribing all the songs, I picked up on a lot of the features I want to track.

@enb34 Japanese is nice because something like formality is usually marked at the end of a line/sentence, so I think it would be fine to just tag that particular ending/particle instead of the whole sentence. From what I saw in the lyric files, there isn't much issue... For features allowed in the schema, I am finalizing a list of characters/words/phrases to search for, and depending on that list we might not put the exact characters directly in the schema, but leave variable names for them instead.
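
As a rough illustration of tagging only the line-final ending rather than the whole sentence, here is a minimal Python sketch. Everything in it is an assumption made for the example: the element name `ending`, the attribute `level`, and the two short lists of endings are hypothetical stand-ins, since the actual feature list and schema are still being finalized. It also matches plain string suffixes, so it does no real morphological analysis.

```python
from xml.sax.saxutils import escape

# Hypothetical endings for illustration only; not the project's actual list.
POLITE_ENDINGS = ["ください", "ました", "ません", "ます", "です"]
PLAIN_ENDINGS = ["だよ", "だね", "だ", "よ", "ね"]


def tag_line_ending(line):
    """Wrap a recognized line-final ending in a hypothetical <ending> element.

    Only the ending itself is tagged, so the surrounding line structure
    (and whatever element contains the line) is left untouched.
    """
    text = line.rstrip()
    for level, endings in (("polite", POLITE_ENDINGS), ("plain", PLAIN_ENDINGS)):
        # Try longer endings first so "だよ" is preferred over "よ".
        for ending in sorted(endings, key=len, reverse=True):
            if text.endswith(ending):
                stem = text[: -len(ending)]
                return f'{escape(stem)}<ending level="{level}">{ending}</ending>'
    # No known ending: return the line unchanged (apart from XML escaping).
    return escape(text)


if __name__ == "__main__":
    for lyric in ["君が好きです", "会いたいよ", "空を見上げて"]:
        print(tag_line_ending(lyric))
```

Because the tag sits inside the line, the line/verse hierarchy stays intact, and the lists at the top could later be swapped for whatever named variables the schema ends up using.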