@frhl These are comments and suggestions I wrote down while testing the current app version. I list them as a checklist here, but we can discuss and modify them as appropriate. I'm also starting to get into some minor cosmetic details for features that are basically done.
Left Bar:
[x] "Input bait" -> "Input bait (HGNC symbol, e.g. BCL2)"
[x] "Search" -> "Search HGNC symbol"
[x] "Significance" -> "Significance metric"
[x] "P-value": italicized P (do this everywhere, to be consistent with plot axis labels)
[x] "LogFC" or "logFC" -> "log2 FC" (with 2 as subscript, again do this everywhere)
Basic Plots: Basic plot options
[x] "Scatterplot" -> "Replicates to compare in scatter plot"
[x] spell out "Replicate 1 vs. Replicate 2" etc. in drop down menu
[x] here and everywhere else we should do FDR<=X (instead of <) for all FDR cutoff
[x] "FDR<0.1,logFC>=0" -> "Color for FDR<=0.1 and log2 FC>=0" etc.
[x] when input data have only 2 replicates, the replicate drop-down menu throws an error
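To make the inclusive cutoffs and the replicate drop-down entries above concrete, here is a minimal sketch in Python. It is only an illustration of the intended logic, not the app's code: the function names `is_enriched`, `threshold_label`, and `replicate_pair_labels` are hypothetical, and the default cutoffs are the FDR<=0.1 and log2 FC>=0 values quoted in this list.

```python
from itertools import combinations


def is_enriched(fdr, log2_fc, fdr_cutoff=0.1, log2_fc_cutoff=0.0):
    """Return True when both cutoffs are met; both comparisons are inclusive."""
    return fdr <= fdr_cutoff and log2_fc >= log2_fc_cutoff


def threshold_label(fdr_cutoff=0.1, log2_fc_cutoff=0.0):
    """Build the legend/summary text from the same cutoffs used for filtering."""
    return f"FDR<={fdr_cutoff:g} and log2 FC>={log2_fc_cutoff:g}"


def replicate_pair_labels(n_replicates):
    """List every replicate pair, spelled out for the drop-down menu.

    With 2 replicates this yields exactly one entry, so the menu
    should still have something valid to select.
    """
    return [
        f"Replicate {a} vs. Replicate {b}"
        for a, b in combinations(range(1, n_replicates + 1), 2)
    ]
```

Deriving the label from the same cutoff values that drive the filter would keep the text and the behavior from drifting apart.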
Basic Plots: Summary
[x] "FDR<0.1, logFC>=0" -> "Enrichment threshold: FDR<=0.1 and log2 FC>=0"
[x] add replicate correlation results. Maybe a small table that shows correlations for all replicate pairs, with last row showing average correlation
[x] "Download" -> "Enrichment Statistics" (or some more descriptive name to indicate what's being downloaded)
[x] "all_gene_names" column in downloaded file: is this implemented so it can be different from "gene" column? If not, no need to output this column.
[x] add "accession_number" column in download file if original input uses accession (instead of HGNC symbol)
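For the replicate correlation table suggested in this section, a rough Python sketch of the layout: one row per replicate pair, with a last row for the average. This assumes Pearson correlation on per-replicate log2 FC values, which is not specified above; `pearson` and `replicate_correlation_table` are hypothetical names.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def replicate_correlation_table(replicates):
    """Return (label, correlation) rows for all pairs, plus an average row.

    `replicates` maps a replicate name to its values, with proteins
    in the same order for every replicate.
    """
    rows = [
        (f"{a} vs. {b}", pearson(replicates[a], replicates[b]))
        for a, b in combinations(replicates, 2)
    ]
    average = sum(r for _, r in rows) / len(rows)
    rows.append(("Average", average))
    return rows
```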
Basic Plots: Volcano plot
[x] don't show horizontal dashed line when using FDR as cutoff
[x] bait hover label: "baitlist" -> "bait"
[x] color legend: show legend (on right side?) for the two color circles (maybe with "FDR<=0.1 and log2 FC>=0" etc. as legend text). The current color key on the left is not that informative and can be removed
Basic Plots: Scatter plot
[x] "Replicate 1 log2(Fold change)" etc. for scatter plot axis labels
[x] remember to change axis labels in downloaded plot file as well (right now it's not consistent)
[x] add color legend (same as volcano plot)
[x] by default, expand and show scatter plot for first pair of replicates (if available) upon data upload (right now it's collapsed by default)
[x] add dashed identity line y=x to plot
Integrated Plots: blue boxes
[x] still need to discuss how to deal with multiple lists of each data type in terms of plotting, displaying, downloading results, etc.
[x] "GWAS catalogue" -> "GWAS Catalog" (the official name)
[x] "gnomAD constraints": An alternative would be to have sliders for pLI score (and o/e score), and all proteins with score > selected threshold would be highlighted in volcano plot just like for all other data types.
[x] GWAS catalog downloaded files: don't need to output plotting parameters?
[x] add download buttons for all data types (+ InWeb, gene list, gnomAD)
[x] for all downloaded mapping files: add "significant" column
[x] ordering: maybe move gnomAD after GWAS catalog?
[x] Upload SNPs/genes: File description above upload box is outdated, refer to updated Supplementary Protocol for new descriptions. (e.g. for SNPs: "Tab-delimited file containing two columns: “listName” (name for each list) and “SNP” (rsID)"; for genes: "Tab-delimited file containing two columns: “listName” (name for each list) and “gene” (HGNC symbol)".)
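As an illustration of the SNP upload format described above (tab-delimited, columns "listName" and "SNP"), here is a small Python sketch that groups rsIDs by list name. The function name `read_snp_lists` is hypothetical and this is not the app's parser; the same shape would apply to gene lists with a "gene" column.

```python
import csv


def read_snp_lists(handle):
    """Parse a tab-delimited file-like object into {listName: [rsID, ...]}."""
    reader = csv.DictReader(handle, delimiter="\t")
    required = {"listName", "SNP"}
    missing = required - set(reader.fieldnames or [])
    if missing:
        # Fail early with the names of the columns that were not found.
        raise ValueError(f"missing required column(s): {sorted(missing)}")
    lists = {}
    for row in reader:
        lists.setdefault(row["listName"], []).append(row["SNP"])
    return lists
```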
Integrated Plots: Volcano plot
[x] make all details consistent with Basic Plots version
[x] add color/shape legend (and remove color key bar)?
[x] colors not correct until Settings section is expanded
[x] for genes upload, legend should show gene list name (instead of "Upload")
[x] for SNP/gene lists, maybe add "SNPs" or "Genes" prefix to labels/legends in volcano plot so it's clear what type of lists are plotted.
Integrated Plots: Venn diagrams
[x] each row is a different data type/list (e.g. "InWeb", "GWAS Catalog", "SNPs: list1", "SNPs: list2", ... and so on)
[x] GWAS catalog: show a single Venn diagram (combining results from all searched traits, don't show single vs. multi-gene loci because cannot get independent SNP list)
[x] SNP lists: specify that user input should contain independent SNPs for each list (so can subset into single vs. multi-gene loci without worrying about caveats)
[x] download buttons: plot and gene names files for each Venn diagram
[x] revise text explanation for the Venn diagram groups: N=total population, A, B
[x] InWeb tab: add bait name in description? (e.g. "B = BCL2 InWeb interactors"); don't need to specify "...in pulldown" for defining B (to be consistent with B description for all other tabs)
[x] InWeb and gnomAD tab: "N" should be changed to "Total population", "pulldown" should be one word
[x] SNP upload tab: "SNPs upload" (to be consistent with "Genes upload"). Also should be "Total population = pulldown..." B description: "Genes mapped from ..." all/multi-gene/single-gene loci.
[x] P-values do not match manuscript results in some circumstances (e.g. Fig 1b CRBN InWeb overlap p-values, Supplementary Table 2 TARDBP InWeb and ALS genes overlap p-values). The differences are very minor but would be good to figure out why they exist.
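To help pin down the N, A, and B definitions when chasing the small P-value differences, here is a hedged Python sketch of an overlap test. It assumes a one-tailed hypergeometric test, which this checklist does not state, and `overlap_p_value` is a hypothetical name. Whether the bait is excluded and whether A and B are restricted to the total population are exactly the kinds of choices that could shift the result slightly.

```python
from math import comb


def overlap_p_value(population, set_a, set_b, bait=None):
    """Return (overlap size, P(overlap >= observed)).

    N = total population (bait removed), A and B = the two sets,
    both restricted to N before counting.
    """
    population = set(population) - {bait}
    a = set(set_a) & population
    b = set(set_b) & population
    overlap = len(a & b)
    n, n_a, n_b = len(population), len(a), len(b)
    # Sum the hypergeometric upper tail from the observed overlap upward.
    tail = sum(
        comb(n_b, k) * comb(n - n_b, n_a - k)
        for k in range(overlap, min(n_a, n_b) + 1)
    )
    return overlap, tail / comb(n, n_a)
```

Running the same inputs through this kind of reference calculation and through the app could show whether the discrepancy comes from the set definitions or from the test itself.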
Protein Family:
[x] maybe rename as "Pathway Annotations" to include HGNC protein family, GO cellular localization, Uniprot localization, MSigDB pathways, etc.
[x] @frhl TO DO: make pathway annotation plotting function
[x] @yuhanhsu TO DO: curate all pathway data sources, probably just need to store as data.frames (instead of hash objects)
[x] @yuhanhsu TO DO: get_pathway_annotations function
[ ] later: add protein family removal feature here
[x] Names for green boxes: "Volcano plot" and "Annotation table"
[x] order legend by set frequency
Exclusive option:
[x] (a) slider for prioritizing protein families
[x] (b) search box for pathways and gene set names (like in the GWAS catalog)
Download:
[x] remove tab. No need for this anymore
Miscellaneous:
[x] consolidate all *.RData documentation into a single file? -> R/data.R
[ ] test inputting data with pre-calculated enrichment stats (logFC, pvalue, FDR) with no data for separate replicates. Test all combinations of possible input formats described in protocol.
[x] test shiny app using a few datasets included in manuscript to ensure consistency (check #s, p-values, etc.)
[x] update Welcome Guide (can be modified from Supplementary Protocol in manuscript)
[x] confirm that bait (when indicated in left bar) is excluded when calculating overlap statistics in Integrated plotting module
[ ] check all download buttons are present, named appropriately, linked with correct files, etc.
[x] before uploading to LageLab GitHub, replace test data files in tests/testthat/data with a published dataset? (e.g. CRBN PPI data used for Figure 1) Alternatively, maybe just mask identity of the dataset (i.e. don't specify which bait/cell line)
[x] InWeb: deal with synonyms of InWeb interactors. Perhaps display a message if the interactor name was found in the pulldown but not in the dataset.
[x] give a warning when some input columns are not accepted.
[x] when input is entirely invalid, give an informative error message.
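The last two items (warn on rejected columns, fail clearly on entirely invalid input) could behave roughly like this Python sketch. The accepted column set is an assumption for illustration: "gene", "accession_number", "logFC", "pvalue", and "FDR" appear in this checklist, while "rep1" to "rep3" are made-up replicate names. `check_input_columns` is a hypothetical name.

```python
import warnings

# Hypothetical set of accepted column names (see assumptions above).
ACCEPTED_COLUMNS = frozenset(
    {"gene", "accession_number", "rep1", "rep2", "rep3", "logFC", "pvalue", "FDR"}
)


def check_input_columns(columns, accepted=ACCEPTED_COLUMNS):
    """Return the accepted columns; warn about the rest; fail if none remain."""
    kept = [c for c in columns if c in accepted]
    rejected = [c for c in columns if c not in accepted]
    if not kept:
        # Entirely invalid input: say what was seen and what is expected.
        raise ValueError(
            f"no accepted columns found in {list(columns)}; "
            f"expected some of {sorted(accepted)}"
        )
    if rejected:
        warnings.warn(f"ignoring unrecognized column(s): {rejected}")
    return kept
```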
Debug
[x] sizes for gene set annotations are not consistent. This is a known error in plotly (add_genoppi_trace.R) in which sizes are specified in marker and not by a size dict. Brainstorm how to work around this.
[x] Basic plot options: color selection boxes: fill with corresponding colors (like in Integrated plotting tab)
[x] Volcano plot: remove FDR color side bar
[x] Summary: add "Replicate correlation" in text above correlation table
[x] hover label: what is "Pulldown (N)"?
[x] downloaded plots: bait point size too big, other searched proteins not labeled, fix scatter plot axis labels and lines
Integrated plotting
[x] Volcano plot: remove FDR color side bar
[x] Volcano plot: legend text not complete: should show log2 FC cutoff as well. brainstorm a way to show both significant + non-significant colors for each data type
[x] gnomAD: use a different default color (red should be reserved for bait). orange or magenta?
[x] gnomAD: remove text at the bottom of the box (doesn't seem to be showing correct/useful info anymore?)
[x] Settings: add options for gnomAD
Integrated plotting: Venn diagrams
[x] show tabs in same order as blue boxes (InWeb, GWAS catalog, gnomAD, SNPs upload, genes upload)
[x] SNPs upload: "Multi loci" and "Single loci" -> "Multi-gene loci" and "Single-gene loci". need to change Venn diagram title accordingly (right now says "All genes" for everything), can just use "All loci", "Multi-gene loci" and "Single-gene loci" as Venn diagram titles
[x] GWAS catalog: missing Venn diagram title
[x] add N, A, and B info for SNPs and genes upload
[x] some N, A, and B numbers look weird? (e.g. GWAS catalog and gnomAD)
[x] discuss/confirm exact text/counts to show for each Venn diagram tab
Gene set annotations
[x] drop-down menu GO descriptions "GO terms: molecular function", "GO terms: cellular component", "GO terms: biological process"
[x] "Search HGNC symbol" not working (Error: could not find function "add_markers_search")
[x] "Search gene set" not working (Error: could not find function "add_markers_search_pathway")
Lower priority
[ ] Basic plotting: changing significance thresholds should not reset selected colors to default
[ ] Venn diagrams: change all "pull down" or "pulldown" descriptions to "proteomic data" (since the data might not necessarily be from pulldown experiments)
[ ] Venn diagrams: italicized P in "P-value" (for InWeb, gnomAD, Genes upload)
[x] Venn diagrams, SNP upload: tab name -> "SNPs upload". "Total population in..." -> "Total population =..."
[ ] Gene set annotations: "alphabetical" legend ordering not working?