IPS-LMU / emuR

The main R package for the EMU Speech Database Management System (EMU-SDMS)
http://ips-lmu.github.io/EMU.html

The 'timeRefSegmentLevel' argument of requery_seq() #210

Closed FredrikKarlssonSpeech closed 3 years ago

FredrikKarlssonSpeech commented 5 years ago

For me requery_seq() is not doing what I expected it to, but to be honest I cannot be sure. I am very confused by the documentation, but need to ask specifically about the timeRefSegmentLevel argument before I make a suggestion.

"set time segment level from which to derive time information. It is only necessary to set this parameter if more than one child level contains time information and the queried parent level is of type ITEM."

I get the first part. If there is no time information in the ITEM, then you have to get it from somewhere. But why are only child levels referenced here? Could a parent level not also determine start and end times (in simple cases)? Is there a reason why the documentation does not just say that you should state which level to get time information from?

raphywink commented 5 years ago

Say you have two child levels: ORT (ITEM) -> KAN (SEGMENT) -> MAU (SEGMENT), where ORT is the parent level that you are performing requery_seq() on. The query engine then has to decide where to get the time information from, and it would have to guess (which it doesn't). So you have to specify it: timeRefSegmentLevel = "KAN" or timeRefSegmentLevel = "MAU" (the times might also differ between the two, of course).

@parent levels determining time: do you have a sensible example for me? I think this would be very iffy annotation structure modelling, and I can't really think of an example where it would be useful or where a hierarchical query wouldn't solve the problem and be more explicit. Most if not all EMU hierarchies I have seen have coarse-grained levels inheriting time information from more fine-grained levels (and/or sometimes carrying time information themselves):

# let's say ORT and MAU are of type SEGMENT
query(db, "[#ORT == hello ^ MAU == l]") # -> will give you the ORT labels & times
query(db, "[ORT == hello ^ #MAU == l]") # -> will give you the MAU labels & times
# is this what you would suggest???
# isn't the above cleaner? and what would you expect the output to be? MAU labels & ORT times?
query(db, "[ORT == hello ^ #MAU == l]", timeRefSegmentLevel = "ORT")

FredrikKarlssonSpeech commented 5 years ago

Yes, I was mainly asking because the docs say specifically that child levels are where time information can come from. As for a sensible example, well, that depends on one's perception of course.

There are some projects that do online intelligibility assessments, for instance. There the rater pushes a button when a part cannot be understood, and this creates a log file. Of course, you cannot rely on the time in the log file being the actual time of the reduced intelligibility, but you can likely connect it to a higher-level unit, with some lag. And it makes no sense to have the stream of markers of reduced intelligibility at a parent level to an utterance (for me at least).

Not the most usual case, I agree. But the question was rather: is deriving times from parent levels already possible, or were the docs correct in describing the currently implemented functionality as being "limited" to deriving times from child levels? Got the answer there already.

raphywink commented 5 years ago

Honestly, I haven't come across such a hierarchy yet and haven't tried whether it works or not (it might!). Do you have a minimal working example + dataset I could try it on?

raphywink commented 5 years ago

Just did a quick scan of the code and it looks like this function:

https://github.com/IPS-LMU/emuR/blob/2edd12d465587eebad3b0b6abd266d64d51faf82/R/emuR-database.DBconfig.R#L145

only looks "down" the hierarchy (going from super level to sub level, then from that sub level to its own sub level, etc.). So it seems that the current query engine can only derive time information from "child" levels. Would still be good to test it though...
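
The downward-only lookup described above can be illustrated with a toy sketch in plain R. It does not reproduce emuR's actual code: the function name `find_time_levels_below`, the `link_defs` table, and the `level_types` vector are hypothetical, and the hierarchy is the ORT -> KAN -> MAU example from earlier in the thread. Only SEGMENT levels are treated as time-bearing here, which is a simplification.

```r
# Toy link definitions (super level -> sub level); hypothetical data,
# mirroring the ORT -> KAN -> MAU example from this thread.
link_defs <- data.frame(
  super = c("ORT", "KAN"),
  sub   = c("KAN", "MAU"),
  stringsAsFactors = FALSE
)
level_types <- c(ORT = "ITEM", KAN = "SEGMENT", MAU = "SEGMENT")

# Hypothetical helper (not an emuR function): walk "down" from `level`,
# following super -> sub links only, and return the SEGMENT levels found.
find_time_levels_below <- function(level, link_defs, level_types) {
  visited <- character(0)
  queue <- link_defs$sub[link_defs$super == level]
  while (length(queue) > 0) {
    current <- queue[1]
    queue <- queue[-1]
    if (current %in% visited) next
    visited <- c(visited, current)
    # enqueue the sub levels of the level just visited
    queue <- c(queue, link_defs$sub[link_defs$super == current])
  }
  visited[level_types[visited] %in% "SEGMENT"]
}

candidates_from_ort <- find_time_levels_below("ORT", link_defs, level_types)
candidates_from_mau <- find_time_levels_below("MAU", link_defs, level_types)
print(candidates_from_ort)
print(candidates_from_mau)
```

Starting from ORT, the sketch finds two candidate levels (KAN and MAU), which is the situation where a choice such as timeRefSegmentLevel = "KAN" is needed. Starting from MAU, it finds none, because a traversal of this kind never looks "up" towards parent levels.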

raphywink commented 3 years ago

This issue seems to be resolved and can be closed, right? If not, please reopen.