Closed matthewhirschey closed 4 years ago
I created a PR to implement the suggestion above, but I'm not sure it fully solves the issue. Pathways now appear really far down the results and seem unlikely to be noticed. Should we remove approved_name searching?
Perhaps if we had several example searches, along with what users would most likely be looking for (or thinking) when performing each one, it would be more obvious what to change.
The PR looks OK to me. The most popular genes by gene_id are on the methods page of ddh.org (TP53, etc.). But the normal search behavior should be: I'm looking for a gene, and therefore genes will be at the top of the list. If I'm looking for a pathway, then few genes should come up (?) and the pathways should float to the top. If "approved name" is causing too many spurious results to come up, then perhaps
Might be a good alternative.
I also added (and committed) just now some code that will arrange each sub-table by the length of the returned query. For example:
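genes_data_symbol <- gene_summary %>%
  # keep rows whose approved_symbol matches the search regex
  filter(str_detect(approved_symbol, find_word_start_regex)) %>%
  # character count of the first column (str_count() with no pattern counts characters)
  mutate(length = str_count(.[[1]])) %>%
  # shortest entries first
  arrange(length) %>%
  head(limit_genes) %>%
  # drop the helper column
  select(-length)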
By doing this, the shorter terms (and therefore better-matched terms) are returned first. However, I now see a random error that I'd like you to try to recreate in your branch, to check whether I introduced it just now; my guess is that it is in your branch too.
Warning: Error in writeImpl: Text to be written must be a length-one character vector
Perform these searches to recreate/test (case does not matter)
Also errors: MDM2, MDM4 (after a quick search of some of the test genes)
Can you look at this @johnbradley ?
Will do @matthewhirschey . I think this error means that we are trying to display a vector of multiple items where shiny is expecting a single item for the content of an HTML tag.
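A minimal sketch of that failure mode, assuming (this is not confirmed in the thread) that a multi-element character vector is reaching a place that expects one string. `as_single_string` is a hypothetical helper, not a function from the app, and the alias values are placeholders; it only shows one way to collapse a vector into a length-one character vector before using it as tag content.

```r
# Hypothetical helper: collapse a character vector into one string.
as_single_string <- function(x, sep = ", ") {
  x <- as.character(x)
  x <- x[!is.na(x)]          # drop missing values rather than printing "NA"
  paste(x, collapse = sep)   # paste(..., collapse = ) returns a length-one vector
}

aka <- c("ALIAS1", "ALIAS2") # placeholder values, for illustration only
single <- as_single_string(aka)

length(aka)     # 2: more than one element
length(single)  # 1: usable as the text of a single tag
```

If the real cause is different (for example, a search that matches more than one row), the same check, `length(x) == 1`, can still help locate where the extra elements come from.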
Issue fixed by @johnbradley
While it seemed OK to have to scroll to find the gene of interest (in a small pond of genes), in the case of "TP53" you never get to see the actual gene: the head=10 threshold means that several other alphabetically ranked genes push TP53 off the bottom of the list.
Need to think about a better way to return the gene of interest.
One idea: sequential search. Instead of str_detect... | str_detect, can we
gene_name %>% (most specific)
aka %>% (most likely alternative)
approved_name (most generic)
And then row_bind, but never re-sort? And then present up to 20 (10 genes, 10 pathways, max) but probably fewer choices?
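A rough sketch of the sequential search idea above, under several assumptions: the toy `gene_summary` table, its rows, and the helper name `sequential_search` are all hypothetical, and a plain case-insensitive pattern stands in for whatever regex the app actually builds. Each column is searched in order of specificity, the hits are stacked with `bind_rows()` without re-sorting, and `distinct()` keeps only the first (most specific) hit per gene.

```r
library(dplyr)
library(stringr)

# Toy stand-in for gene_summary; column names and rows are assumptions.
gene_summary <- tibble(
  approved_symbol = c("AAA1", "TP53", "BBB2", "CCC3"),
  aka             = c("x-TP53-like", "P53", "none", "none"),
  approved_name   = c("alpha", "tumor protein p53", "TP53 regulated factor", "gamma")
)

# Hypothetical helper: search columns from most to least specific,
# stack the hits in that order, and never re-sort afterwards.
sequential_search <- function(data, pattern, limit = 10) {
  rx <- regex(pattern, ignore_case = TRUE)
  bind_rows(
    filter(data, str_detect(approved_symbol, rx)),  # most specific
    filter(data, str_detect(aka, rx)),              # most likely alternative
    filter(data, str_detect(approved_name, rx))     # most generic
  ) %>%
    distinct(approved_symbol, .keep_all = TRUE) %>% # keep first hit per gene
    head(limit)
}

result <- sequential_search(gene_summary, "TP53")
result$approved_symbol
# "TP53" "AAA1" "BBB2"
```

With an alphabetical sort, AAA1 would land above TP53; here the exact symbol match stays first because the tiers are never re-sorted. The same pattern could be repeated for pathways with its own limit to get the "10 genes, 10 pathways" cap.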