Closed aminakur closed 1 year ago
Hi!
In your case you need to use calcChromBins
function. The functions with "ref" suffix are available only for human and mouse.
The process is following:
1) calculate the chromosome sizes:
chrom_sizes = getChromSizesFromFasta("~/Downloads/azucena.fna")
2) calculate equally sized bins across chromosomes (you can change the number or size of the bins):
bins = getGenomeBins(chrom_sizes)
3) pass the bins to the calcChromBins
function:
x = calcChromBins(granges, bins)
To plot the distribution:
plotChromBins(x)
Please let me know if you run into more issues or if you need assistance with some other functions.
Best, Kristyna
Thank you! This worked. I have two more questions:
Glad that helped! 1) Sorry, that was my mistake. Current version of GenomicDistributions allows only binCount parameter. But you can always check the size of the bins straight by looking at the bins object and adjust the binCount accordingly to fit your desired bin width.
2) You can extract that from you calculated x
object. The calcChromBins
function returns a table, where the first three columns are bin coordinates, then there are two columns (regionID, withingGroupID) which just allow positioning of the bins for plotting, then there is column N, which has the information about the number of regions falling into a given bin (if you are running the function on multiple datasets, there is also an additional column name, which distinguishes the input queries).
To get the bins with the highest number of regions, you can sort the table as following and the bin with the highest number will be in the first row of the outputted table:
library(tidyverse)
highestSort = x %>%
arrange(desc(N))
To arrange from lowest to highest:
LowestSort = x %>%
arrange(N)
Fyi, you might also be interested in following vignette that was part of a Bioconductor 2022 workshop, which contains more specific instructions (in the second part of the workshop), how to work with custom genomes and how to edit plots.
Thank you!
I am trying to use calcChromBinsRef to calculate the distribution of features in my custom reference genome. I first loaded the ref genome using:
azucena <- read.fasta("~/Downloads/azucena.fna")
Then:
I calculated the chromosome sizes using getChromSizesFromFasta:
chrom_sizes = getChromSizesFromFasta("~/Downloads/azucena.fna")
But received the same error message.
Next, I tried calcChromBinsRef with chrom_sizes:
So how should I use my reference genome exactly?