jtlovell / GENESPACE

Highlighting small region or gene of interest #160

Open dmacguigan opened 3 months ago

dmacguigan commented 3 months ago

First off, thank you for the fantastic tool @jtlovell!

I noticed that there are some old issues (#121 and #127) where folks were interested in highlighting a single gene or small region. I have encountered the same issue. For example, with an ROI bed file like this:

   genome         chr   start     end
1   Masi NC_059344.1       0   10000

I end up with the entire chromosome highlighted because there is only one large synteny block spanning several Mbp.

image

Do you have any advice on how to plot ROIs at smaller scales? Ideally down to a single syntenic gene pair.

jtlovell commented 3 months ago

Are you using useOrder = TRUE? If so, you've asked for the first 10k genes, not 10k bp. If you are using useOrder = FALSE and still getting this, then it is indeed a bug and I will try to figure it out.
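
To make the distinction concrete, here is a minimal base R sketch with a made-up gene table. The `genes` data frame and its columns are hypothetical and do not reflect GENESPACE internals; the sketch only illustrates how the same interval, 0 to 10000, can select very different sets of genes depending on whether it is read as bp or as gene rank order.

```r
# Hypothetical toy annotation: 20,000 genes, one every 3 kb on one chromosome.
# Illustration only; this is not how GENESPACE stores its data.
genes <- data.frame(
  id    = sprintf("gene%05d", 1:20000),
  start = seq(from = 1000, by = 3000, length.out = 20000)
)
genes$ord <- rank(genes$start)  # gene rank order along the chromosome

roi_start <- 0
roi_end   <- 10000

# Interpreting the ROI as physical positions (bp)
n_bp <- sum(genes$start >= roi_start & genes$start <= roi_end)

# Interpreting the ROI as gene rank order
n_ord <- sum(genes$ord >= roi_start & genes$ord <= roi_end)

cat("genes selected by bp:   ", n_bp, "\n")
cat("genes selected by order:", n_ord, "\n")
```

In this toy example the bp interpretation picks up only a handful of genes, while the order interpretation picks up half of the chromosome.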

dmacguigan commented 3 months ago

Nope, useOrder = FALSE. Here's my full call to plot_riparian:

ripDat <- plot_riparian(
  useOrder = FALSE,
  gsParam = out,
  minChrLen2plot = 5000000, # only plot scaffolds > 5 Mb
  invertTheseChrs = inv_df,
  refGenome = "Masi",
  syntenyWeight = 1,
  chrLabFontSize = 3,
  backgroundColor="grey",
  chrFill = "lightgrey",
  addThemes = ggthemes,
  braidAlpha = .75,
  highlightBed = roi_test,
  useRegions = FALSE,
  pdfFile="./test4.pdf")

jtlovell commented 3 months ago

I'm not sure ... there's only one obvious possibility in the code: "end" %in% names(data.table(roi_test)) is FALSE. If that is the case, GENESPACE will set end = Inf. However, from the above, it looks like the names are correct.

My guess is that something is going on here that I have not anticipated. Can you run it with no parameters other than highlightBed = roi_test? Does this still produce the problematic result? What about increasing the end of highlightBed to 1 Mb?
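
A quick way to rule this out is to check the ROI table directly. The sketch below rebuilds the ROI from the first post as a plain data.frame (base R rather than data.table, so it runs without extra packages) and checks the four column names shown in this thread. The extra checks on the coordinates are an assumption about what could go wrong, not a description of what GENESPACE itself validates.

```r
# Rebuild the ROI from the first post as a plain data.frame (base R only).
roi_test <- data.frame(
  genome = "Masi",
  chr    = "NC_059344.1",
  start  = 0,
  end    = 10000,
  stringsAsFactors = FALSE
)

# Column names must match exactly, including case and no stray whitespace.
expected     <- c("genome", "chr", "start", "end")
missing_cols <- setdiff(expected, names(roi_test))

# Assumed sanity checks: coordinates are numeric, finite, and start <= end.
coords_ok <- is.numeric(roi_test$start) &&
  is.numeric(roi_test$end) &&
  all(is.finite(roi_test$end)) &&
  all(roi_test$start <= roi_test$end)

print(missing_cols)  # empty if all expected columns are present
print(coords_ok)
```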

Thanks for helping debug. JL

dmacguigan commented 3 months ago

Thanks for taking a closer look at this @jtlovell. I think I've isolated the issue a bit more.

When I run plot_riparian with all defaults and only one ROI, it highlights a portion of the chromosome.

However, chromosome NC_059344.1 is 69,199,620 bp long and I'm only trying to highlight the first 100,000 bp in this test. The highlighted region appears far larger than 100,000 bp.
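
As a rough check on scale, the expected highlight is a very thin sliver of the chromosome. A short calculation using only the two numbers above:

```r
chr_len <- 69199620  # length of NC_059344.1 in bp, from the post
roi_len <- 1e5       # ROI spans 0 to 100,000 bp

# Fraction of the chromosome that the ROI should cover
frac <- roi_len / chr_len
cat(sprintf("ROI covers %.3f%% of the chromosome\n", 100 * frac))
```

That is well under 1% of the chromosome length, so at whole-chromosome scale a correct highlight would be barely visible.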

> roi_test_4
   genome         chr start   end
11   Masi NC_059344.1     0 1e+05

> # test with defaults
> ripDat <- plot_riparian(
+   gsParam = out,
+   highlightBed = roi_test_4,
+   pdfFile="./test10.pdf")

image

Also, when I include a few other ROIs on the same chromosome, plot_riparian now highlights the entire chromosome.

> roi_test_3
   genome         chr  start    end
1    Masi NC_059344.1  21214  70543
2    Masi NC_059344.1 338474 417339
11   Masi NC_059344.1      0  1e+05

> # test with defaults
> ripDat <- plot_riparian(
+   gsParam = out,
+   highlightBed = roi_test_3,
+   pdfFile="./test9.pdf")

image
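
One thing worth noting about roi_test_3 is that row 11 (0 to 1e+05) fully contains row 1 (21214 to 70543). Whether overlapping ROIs matter to plot_riparian is not known here; that is only an assumption worth testing. A minimal base R sketch that merges overlapping intervals, so the test can be repeated with non-overlapping ROIs:

```r
# ROI coordinates copied from roi_test_3 above (all on NC_059344.1).
roi <- data.frame(
  start = c(21214, 338474, 0),
  end   = c(70543, 417339, 1e5)
)
roi <- roi[order(roi$start), ]

# Start a new group whenever an interval begins after the running
# maximum end of all earlier intervals; otherwise it overlaps them.
run_end <- cummax(roi$end)
new_grp <- c(TRUE, roi$start[-1] > head(run_end, -1))
grp     <- cumsum(new_grp)

# Collapse each group to a single interval.
merged <- data.frame(
  start = as.numeric(tapply(roi$start, grp, min)),
  end   = as.numeric(tapply(roi$end, grp, max))
)
total_bp <- sum(merged$end - merged$start)

print(merged)
cat("total bp covered:", total_bp, "\n")
```

Even after merging, the ROIs cover far less than 1% of the 69,199,620 bp chromosome, so a whole-chromosome highlight is not explained by the ROI sizes alone.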

jtlovell commented 2 months ago

Thanks for working on this ... I'm sorry, but I don't know what can be causing it. I'd be happy to debug with your data, but won't be able to get to it until August. If that works for you, send me an email: jlovell [at] hudsonalpha [dot] org