tidyomics / plyranges

A grammar of genomic data transformation
https://tidyomics.github.io/plyranges/

anchoring redesign. #47

Open sa-lee opened 6 years ago

sa-lee commented 6 years ago

There was some general confusion about this during the talk/workshop at Bioc2018. People felt it was not intuitive that the anchoring is dropped after a mutate call and that anchoring decorates a GRanges object. There was a suggestion of making anchoring act like dplyr's scoped variants so something like:

gr %>% mutate_at_anchor(..., anchor = "start")

could be a possibility. I think I need to think about this a bit more though.
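For readers new to the term: an anchor decides which coordinate of a range stays fixed when its width changes. Below is a toy base R sketch of that idea only; it is not plyranges code. `resize_anchored()` is a hypothetical helper, and the rounding used for the center anchor is an assumed convention.

```r
# Toy illustration in base R; resize_anchored() is hypothetical, not a plyranges function.
resize_anchored <- function(start, width, new_width,
                            anchor = c("start", "end", "center")) {
  anchor <- match.arg(anchor)
  end <- start + width - 1  # inclusive end coordinate
  new_start <- switch(anchor,
    start  = start,                               # start stays put
    end    = end - new_width + 1,                 # end stays put
    center = start + (width - new_width) %/% 2    # assumed rounding for the midpoint
  )
  data.frame(start = new_start,
             end = new_start + new_width - 1,
             width = new_width)
}

ranges <- data.frame(start = c(11, 21), width = c(5, 5))

# Doubling the width gives different ranges depending on the anchor.
fixed_start <- resize_anchored(ranges$start, ranges$width, ranges$width * 2, "start")
fixed_end   <- resize_anchored(ranges$start, ranges$width, ranges$width * 2, "end")
```

With the start anchored the ranges grow to the right; with the end anchored they grow to the left. That is why the choice of anchor has to be visible somewhere in the call.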

lawremi commented 6 years ago

Since there are only three possible anchors (five for stranded), we could just enumerate: mutate_with_fixed_start(), mutate_with_fixed_end(). Another idea is kinda like desc() in dplyr. Just cast the width assignment:

mutate(anchor_at_start(width = width * 2), score = score / mean(score))

That's attractive because it doesn't cause explosion in the verb names, and it localizes the anchor aspect to the width manipulation, where it actually matters. There's no decoration of the GRanges although there is a virtual decoration of the expression, but that's not a big deal.

sa-lee commented 6 years ago

I like that. I think the code would look more like dplyr if you set width outside the call to anchor_*, and then there's no confusion about how to update columns using mutate():

mutate(width = anchor_on_start(width * 2), ...)

lawremi commented 6 years ago

Yea, good point. I think the preposition should be "to", like anchor_to_start(). At least that's how we would say it in English.
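To make the proposal concrete, here is a rough base R sketch of how tagging the width expression could work. Everything in it is hypothetical: `anchor_to_start()`, `anchor_to_end()` and `mutate_ranges()` are illustrative stand-ins operating on a plain data frame, not the plyranges API, and defaulting to the start anchor is an assumption.

```r
# Hypothetical sketch; none of these functions are part of plyranges.
# The anchor_to_*() helpers only tag a new width with the coordinate to hold fixed.
anchor_to <- function(width, anchor) structure(width, anchor = anchor)
anchor_to_start <- function(width) anchor_to(width, "start")
anchor_to_end   <- function(width) anchor_to(width, "end")

# Toy stand-in for mutate() on a data frame with start/end/width columns.
mutate_ranges <- function(ranges, ...) {
  exprs <- eval(substitute(alist(...)))  # capture the named expressions unevaluated
  for (name in names(exprs)) {
    value <- eval(exprs[[name]], ranges, parent.frame())
    if (name == "width") {
      anchor <- attr(value, "anchor")
      if (is.null(anchor)) anchor <- "start"  # assumed default when untagged
      new_width <- as.vector(value)           # drop the tag before storing
      if (anchor == "end") {
        ranges$start <- ranges$end - new_width + 1
      } else {
        ranges$end <- ranges$start + new_width - 1
      }
      ranges$width <- new_width
    } else {
      ranges[[name]] <- value                 # ordinary column update
    }
  }
  ranges
}

ranges <- data.frame(start = c(11, 21), end = c(15, 25),
                     width = c(5, 5), score = c(1, 3))
out <- mutate_ranges(ranges,
                     width = anchor_to_end(width * 2),
                     score = score / mean(score))
```

The tag lives only on the value of the width expression, so the ranges object itself carries no anchor before or after the call, and other columns such as score are updated as in a normal mutate().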