IPS-LMU / emuR

The main R package for the EMU Speech Database Management System (EMU-SDMS)
http://ips-lmu.github.io/EMU.html
23 stars 15 forks source link

dur() does not work on tibble-typed segment list #234

Closed MJochim closed 3 years ago

MJochim commented 4 years ago

The dur() function does not work on segment lists that have been created with query(resultType ="tibble"), which has been the default since emuR 2.0.0.

emuR::create_emuRdemoData("~/")
ae = emuR::load_emuDB("~/emuR_demoData/ae_emuDB/")
sl = emuR::query(ae, "Phonetic == n", resultType = "tibble")
sl_old = emuR::query(ae, "Phonetic == n", resultType = "emuRsegs")

emuR::dur(sl_old) # works
emuR::dur(sl) # errors out
Error in UseMethod("dur") : 
  no applicable method for 'dur' applied to an object of class "c('tbl_df', 'tbl', 'data.frame')"

Even explicitly calling dur.emusegs does not work:

WARNING DEAR USERS: Never do the triple colon thing. It is for developers only and will break your scripts.
emuR:::dur.emusegs(sl)
Error in attr(x, "tsp") <- c(1, NROW(x), 1) :
  invalid time series parameters specified

The emuR:::dur.emusegs(x) function does a lot of end(x) and start(x) and apparently, when called with a tibble-typed segment list, it uses stats::end instead of emur:::end.emusegs.

At some point, this should be fixed or functions like dur be marked as deprecated.

raphywink commented 4 years ago

Well it isn't deprecated but it only works on the legacy S3 class. We decided back then to stick to the default "tibble" class and not create a super-class to have class function available for it. That is why you now have to simply go tbl$end - tbl$start to get the dur. times. I am aware that this is a breaking change and that is why it was the main reason why we did the version bump to 2.0.0

raphywink commented 4 years ago

Same goes for eplot() dplot() and a few other functions btw

MJochim commented 4 years ago

I am aware of that and tbl$end - tbl$start is exactly what I told the user who reported this. I was not aware that even the dur() function had broken with the upgrade to tibble, but it totally makes sense when I think about it. And actually, when I wrote this:

At some point, this should be fixed or functions like dur be marked as deprecated.

What I really meant was that dur should be deprecated and not fixed. We have been internally “phasing out“ things like dur and dplot, so why not put a note in the manual pages? I wouldn’t call it a priority, either. But I think that eventually it would be the right thing to do.

Well it isn't deprecated but it only works on the legacy [...]

That is more or less the definition of deprecated ;-)

raphywink commented 4 years ago

Well it is sort of a question of what we define as deprecated. tibbles are the new fancy pants hip and happening data.frames but are all data.frames supposed to be marked as deprecated because of them? I don't think so.... same goes for emusegs vs. the new tibble segment list type as one doesn't deprecate the other. We do however recommend using the newer thing because of obvious reasons though...

raphywink commented 4 years ago

Or in other words: if there is backward compatibility that means there is no deprecation (unless the plan is to get rid of the backward compatibility in the short/long run)

MJochim commented 4 years ago

We do however recommend using the newer thing because of obvious reasons though...

And we teach our students to do that, and the manual has that focus, and we recommend our colleagues to drop the old way and adopt the new way, and we don’t really maintain the help pages for the old functions (otherwise help(dur) would state that it does not work with the default query()).

In my opinion, there are very clear indications that what we are doing is, in fact, a deprecation of things like dur(). However, I can see what you’re saying about backward compatibility. In the end, it’s a matter of what exactly you would or wouldn’t mean by “deprecation.” And that’s not something we need to settle ;-).

My suggestion is to add a note to help(dur), help(dplot) etc. saying that these functions are [deprecated|not recommended|not fully supported|a candidate for considering a different option|whatever the wording].

I don’t think this is a pressing matter, but I do think it would improve the documentation.