jtlovell / GENESPACE

Subset ref genome chrs in riparian plot #104

Closed kushalsuryamohan closed 1 year ago

kushalsuryamohan commented 1 year ago

Hello, Thank you for creating such a fantastic tool!

I'm trying to create a riparian plot in which I subset and reorder the chromosomes/scaffolds of one species (the reference genome, which has 19 chromosome-level scaffolds), but I run into trouble when I use the 'customRefChrOrder' parameter to reorder them.

I tried subsetting the data first by using the highlightBed option but that gives a very unhelpful error log.

Here's the code/call to the riparian plotting function:

load('/MG/SHARED/ANALYSIS/DEMUX/GENOME_ANALYSIS/snakes_genespace/results/gsParams.rda',verbose = TRUE)
ggthemes <- ggplot2::theme(
  panel.background = ggplot2::element_rect(fill = "white"))
ripDat <- plot_riparian(
  gsParam = gsParam,
  refGenome = "Naja_naja",
  genomeIDs = c("Naja_naja", "N_nigricollis", "Greenmamba", "P_colletti", "krait"),
  forceRecalcBlocks = FALSE,
  gapProp = 0.02,
  scaleBraidGap = 2,
  scaleGapSize = .5,
  howSquare = 2,
  chrFill = "lightgrey",
  reorderBySynteny = FALSE,
  # reorder scaffolds based on size
  customRefChrOrder = c(
    "ScVE01q_1090;HRSCAF=1250", "ScVE01q_1317;HRSCAF=1506", "ScVE01q_965;HRSCAF=1107",
    "ScVE01q_3;HRSCAF=25", "ScVE01q_1539;HRSCAF=1758", "ScVE01q_1725;HRSCAF=1971",
    "ScVE01q_1377;HRSCAF=1575", "ScVE01q_1465;HRSCAF=1671", "ScVE01q_1043;HRSCAF=1199",
    "ScVE01q_1;HRSCAF=4", "ScVE01q_221;HRSCAF=275", "ScVE01q_1635;HRSCAF=1871",
    "ScVE01q_2;HRSCAF=16", "ScVE01q_533;HRSCAF=621", "ScVE01q_1130;HRSCAF=1298",
    "ScVE01q_807;HRSCAF=927", "ScVE01q_1131;HRSCAF=1299", "ScVE01q_768;HRSCAF=884",
    "ScVE01q_6;HRSCAF=32"),
  addThemes = ggthemes)

Here's the error log:

GENESPACE v1.2.3: synteny and orthology constrained comparative genomics
Error in riparian_engine(blk = blksTp, bed = bed, refGenome = refGenome, :
  customRefChrOrder but not all refchrs are in this list
Calls: plot_riparian -> riparian_engine
Execution halted

The reference genome has ~1800 scaffolds - I wonder if this is the reason for the error?

However, when I tried to create a riparian plot without trying to force the order of the scaffolds using the code below, it seems to be working (see attached plot).

load('/MG/SHARED/ANALYSIS/DEMUX/GENOME_ANALYSIS/snakes_genespace/results/gsParams.rda',verbose = TRUE)
ggthemes <- ggplot2::theme(
  panel.background = ggplot2::element_rect(fill = "white"))
ripDat <- plot_riparian(
  gsParam = gsParam,
  # highlightBed = roi,
  refGenome = "Naja_naja",
  genomeIDs = c("Naja_naja", "N_nigricollis", "Greenmamba", "P_colletti", "krait"),
  forceRecalcBlocks = FALSE,
  gapProp = 0.02,
  scaleBraidGap = 2,
  scaleGapSize = .5,
  howSquare = 2,
  chrFill = "lightgrey",
  reorderBySynteny = FALSE,
  addThemes = ggthemes)

Can you please help solve this problem? I'd like to reorder this figure in decreasing order of scaffold size of the reference genome.

Thanks!

Rplots.pdf
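The size-based ordering requested above could be derived from the reference bed file rather than typed by hand. The following is only a minimal sketch on toy data: the data frame `bed`, its column names, and the use of the largest gene end coordinate as a stand-in for scaffold length are all assumptions, not something confirmed in this thread. True scaffold lengths would come from the assembly itself.

```r
# Toy stand-in for a reference bed table (chr, start, end, id).
# Column names and values are hypothetical, for illustration only.
bed <- data.frame(
  chr   = c("s1", "s1", "s2", "s3", "s3"),
  start = c(1, 500, 1, 1, 4000),
  end   = c(100, 900, 300, 200, 5000),
  id    = c("g1", "g2", "g3", "g4", "g5"),
  stringsAsFactors = FALSE
)

# Approximate each scaffold's size by the largest gene end coordinate on it.
chrSize <- vapply(split(bed$end, bed$chr), max, numeric(1))

# Scaffold names sorted from largest to smallest approximate size.
sizeOrder <- names(chrSize)[order(chrSize, decreasing = TRUE)]
print(sizeOrder)
```

A vector built this way covers every scaffold present in the bed table, which may matter given the error message quoted above.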

jtlovell commented 1 year ago

Hmm. GENESPACE is telling you that some of the chromosomes listed there are not named in the bed file of the reference genome. Could this be? Maybe check by reading in the bed file and making sure that there are no mismatches. Regardless, can you please confirm that this isn't the case? It may also be that GENESPACE doesn't want special characters in the chromosome names in this particular situation. If this is the case, then this is a bug that I'll need to fix. Let me know.
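One way to run the suggested check is to compare the custom order against the chromosome names in the bed file in both directions. This is a hedged sketch on toy data: the bed layout (four columns: chr, start, end, id), the commented-out file path, and the scaffold names are assumptions made for illustration.

```r
# Toy stand-in for the reference bed file. With real data it might be read as:
# bed <- read.delim("path/to/bed/Naja_naja.bed", header = FALSE,
#                   col.names = c("chr", "start", "end", "id"))
# (the path and the column layout are assumptions).
bed <- data.frame(
  chr   = c("scafA;HRSCAF=1", "scafA;HRSCAF=1", "scafB;HRSCAF=2", "scafC;HRSCAF=3"),
  start = c(1, 500, 1, 1),
  end   = c(400, 900, 300, 200),
  id    = c("g1", "g2", "g3", "g4"),
  stringsAsFactors = FALSE
)

# Hypothetical custom order: omits one scaffold and contains one misspelled name.
customOrder <- c("scafB;HRSCAF=2", "scafA;HRSCAF=1", "scafX;HRSCAF=9")

refChrs <- unique(bed$chr)

# Names in the custom list that do not occur in the bed file.
notInBed <- setdiff(customOrder, refChrs)

# Reference chromosomes in the bed file that the custom list leaves out.
notInList <- setdiff(refChrs, customOrder)

print(notInBed)
print(notInList)
```

If `notInBed` is empty but `notInList` is not, the names themselves match and the list is simply incomplete, which would fit the wording of the error message ("not all refchrs are in this list").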