r4fun / hierplane

🌳 Hierplane for R
https://r4fun.github.io/hierplane/

hierplane fails with multiple sentences #30

Closed tylerlittlefield closed 4 years ago

tylerlittlefield commented 4 years ago

While trying to reproduce the hierplane you see here, I ran into the following error:

hierplane::hp_spacyr("With Hierplane you can explore linguistic structures naturally. Try it for yourself!")
#> hierplane::hp_spacyr("With Hierplane you can explore linguistic structures naturally. Try it for yourself!")
#> Error in word_locs[["start"]] : subscript out of bounds
#> In addition: Warning messages:
#> 1: In if (is_punct) word <- paste0("[", word, "]") :
#>   the condition has length > 1 and only the first element will be used
#> 2: In 1:word_id :

mathidachuk commented 4 years ago

If you call get_sents() and then lapply() over the result, it should work.

sents <- hierplane:::get_sents("With Hierplane you can explore linguistic structures naturally. Try it for yourself!")
lapply(sents, function(x) hierplane(hp_spacyr(x)))

We can definitely move get_sents() into hp_spacyr() though. Thoughts?

mathidachuk commented 4 years ago

You can actually split the output of spacyr_df() by sentence_id as well, but then hp_spacyr() will return either a single JSON object or a list of JSON objects.
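
As a rough illustration of splitting by sentence_id, here is a minimal base R sketch. The data frame is a made-up stand-in for the parsed tokens (only the sentence_id, token_id and token column names come from this thread), so treat it as an assumption rather than what spacyr_df() actually returns.

```r
# Toy stand-in for the parsed-token data frame; the values are invented.
tokens <- data.frame(
  sentence_id = c(1, 1, 1, 2, 2),
  token_id    = c(1, 2, 3, 1, 2),
  token       = c("Explore", "structures", ".", "Try", "it"),
  stringsAsFactors = FALSE
)

# split() returns a named list with one data frame per sentence.
by_sentence <- split(tokens, tokens$sentence_id)

length(by_sentence)  # number of sentences found
names(by_sentence)   # sentence ids, as character names
```

Each element of by_sentence could then be handed to whatever builds a single tree, which is what makes the lapply() pattern work.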

TLDR: you want this?

lapply(get_sents(txt), function(x) hierplane(hp_spacyr(x)))

or this?

lapply(hp_spacyr(txt), hierplane)

tylerlittlefield commented 4 years ago

I definitely prefer the second, but I wonder if we should just handle it in hp_spacyr() so the user doesn't need to use lapply() to begin with.

mathidachuk commented 4 years ago

Would you be ok with them getting a list of hierplane objects from hierplane? They will need to handle a list output at some point if they just want to use hierplane.

What we can do is wrap multiple objects into a slickR carousel or some other aggregated display object, if necessary, for people who use hierplane in shiny.

tylerlittlefield commented 4 years ago

Hm, well I am not totally against lapply(hp_spacyr(txt), hierplane) if we add an informative stop() when there's more than one sentence. I am mainly just curious about how we recreate the hierplane from the original documentation.
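
For what it's worth, the informative stop() could look something like the sketch below. assert_single_sentence() is a hypothetical helper, not an existing hierplane function, and it assumes the parsed tokens carry a sentence_id column.

```r
# Hypothetical helper (not part of hierplane): fail loudly when the parsed
# tokens span more than one sentence.
assert_single_sentence <- function(tokens) {
  n_sents <- length(unique(tokens$sentence_id))
  if (n_sents > 1) {
    stop(
      "Found ", n_sents, " sentences, but only one is supported. ",
      "Consider splitting the input and using lapply().",
      call. = FALSE
    )
  }
  invisible(tokens)
}

# Made-up inputs for illustration.
one <- data.frame(sentence_id = c(1, 1), token = c("Try", "it"),
                  stringsAsFactors = FALSE)
two <- data.frame(sentence_id = c(1, 2), token = c("Explore", "Try"),
                  stringsAsFactors = FALSE)

assert_single_sentence(one)  # passes silently
msg <- tryCatch(assert_single_sentence(two),
                error = function(e) conditionMessage(e))
```

Returning the input invisibly keeps the check usable inside a pipe.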

mathidachuk commented 4 years ago

Ah got it. Actually it is possible (assuming you don't want the styling to be automatically determined as well!) if you want to link them with a higher level root.

Here's my rough attempt at replicating the example: image

I disabled the span highlighting to get this to work in a pinch. There should be some pretty straightforward logic we can implement to get the spans to work again. I just don't think linking it at the top is very practical unless you can effectively control the size of each plane. As you can see, the end result right now is not very pretty...

The code below is pretty easy to implement I think, but I prefer the lapply/slickr way.

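library(dplyr)      # %>%, mutate(), filter(), bind_rows(), mutate_if()
library(hierplane)  # assumed attached; unexported helpers may need hierplane:::

txt <- "With Hierplane you can explore linguistic structures naturally. Try it for yourself!"
# prefix token ids with the sentence id so they stay unique across sentences
df_spacy_init <- hierplane:::spacyr_df(txt) %>%
  mutate(head_token_id = paste(sentence_id, head_token_id),
         token_id = paste(sentence_id, token_id))
# demote the per-sentence roots so they hang off a new shared root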
df_root <- df_spacy_init %>% 
  filter(dep_ %in% "ROOT") %>% 
  mutate(head_token_id = "1",
         dep_ = "verb") 
df_root <- bind_rows(
  df_root,
  data.frame(
    head_token_id = "1",
    token_id = "1",
    token = "and", 
    dep_ = "ROOT"
  )
)
df_children <- df_spacy_init %>% 
  mutate(dep_ = tolower(dep_))

df_spacy <- bind_rows(df_root, df_children) %>%
  mutate_if(is.logical, ~tidyr::replace_na(., FALSE)) %>% 
  transform_logical()

settings <- hierplane:::spacyr_default() %>% 
  add_styles(
    node_type_to_style = list(
      "ROOT" = "placeholder",
      "verb" = "color1",
      "nsubj"= "color2",
      "pobj" = "color3",
      "dobj" = "color3",
      "aux" = "color4",
      "amod" = "color4",
      "advmod" = "color4",
      "prep" = "color5",
      "punct"= "placeholder"
    ),
    link_to_positions = list(
      "nsubj" = "left",
      "pobj" = "right",
      "dobj" = "right"
    )
  )

spacy_tree <- build_tree(df_spacy, title = txt, settings = settings)
hierplane(spacy_tree)

mathidachuk commented 4 years ago

@tyluRp - reminder to discuss multi-sentence handling

mathidachuk commented 4 years ago

After playing with it a bit more, I still think the only sound solution is returning an hp object per sentence. If there are no objections, I can go ahead and implement it.

tylerlittlefield commented 4 years ago

I think this is a good approach. Multiple sentences will return a list of hierplane objects, right?

mathidachuk commented 4 years ago

That's exactly it! If there is one sentence, return a single hp object. If there is more than one, return a list of hp objects. The user can then call lapply(list_of_hp, hierplane) to get all the widgets if they want.
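
A minimal sketch of that return logic, under some assumptions: build_per_sentence() and build_one are hypothetical names, build_one stands in for whatever turns one sentence's tokens into an hp object, and the tokens are assumed to have a sentence_id column.

```r
# Hypothetical sketch, not hierplane's implementation.
build_per_sentence <- function(tokens, build_one) {
  pieces <- lapply(split(tokens, tokens$sentence_id), build_one)
  if (length(pieces) == 1) {
    pieces[[1]]     # one sentence: return the object itself
  } else {
    unname(pieces)  # several sentences: return a plain list of objects
  }
}

# Stand-in builder that just glues the tokens back together.
glue_tokens <- function(df) paste(df$token, collapse = " ")

single <- data.frame(sentence_id = c(1, 1), token = c("Try", "it"),
                     stringsAsFactors = FALSE)
multi  <- data.frame(sentence_id = c(1, 1, 2), token = c("Try", "it", "Go"),
                     stringsAsFactors = FALSE)

out_single <- build_per_sentence(single, glue_tokens)
out_multi  <- build_per_sentence(multi, glue_tokens)
```

One trade-off of this design is that callers have to check whether they got a single object or a list before passing the result on.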
