Closed shesanIsland closed 1 year ago
I assume you configured your genome properly and reference genome name can be seen by your ls
command.
Using reference genome: agamp4 [ GC: yes, mask: yes ]
But your rd histogram is empty
Chromosomes with RD histograms [bin sizes]: []
Could you please try a different bin size like 1000 instead of 100 and then using ls
command check this Chromosomes with RD histograms [bin sizes]:
field. You will see something like this if you processed it using 1000 bin size
Chromosomes with RD histograms [bin sizes]: [1000]
Thank You Arijit
Thanks for the reply. I receive the same output as my initial issue even after changing bin size to 1000.
Could you please check if your reference genome configuration is correct or not. There are two ways to add a new reference genome.
-conf
parameterI assume you followed the second way, as there is no -conf
parameter in the shared commands. Could you please confirm it.
Also, for the rd
command, please remove the -partition 100 -call 100
parameters. It should be something like this
cnvpytor -root /filepath/filename.pytor \
-rd /filepath/filename.bam \
-chrom backbone, transgene, AgamP4_2L, AgamP4_2R, AgamP4_3L, AgamP4_3R, AgamP4_X
then run the rest of the command.
cnvpytor -root /filepath/filename.pytor -stat 100
cnvpytor -root /filepath/filename.pytor -his 100
cnvpytor -root /filepath/filename.pytor -partition 100
cnvpytor -root /filepath/filename.pytor -call 100
cnvpytor -root /filepath/filename.pytor -view 100
If the reference genome is provided using-conf
parameter. For each command, you need to add -conf
parameter..
Yes, I've also created the .pytor file without the partition and call parameters and had the same result.
A ref_genome_conf.py file was created with the second method:
import_reference_genomes={
"agamp4":{
"name": "AgamP4",
"species": "Anopheles Gambiae",
"chromosomes": OrderedDict(
[("backbone", (12339, "A")), ("transgene", (8299, "A")),
("AgamP4_2L", (49364325, "A")), ("AgamP4_2R", (61545105, "A")),
("AgamP4_3L", (41963435, "A")), ("AgamP4_3R", (53200684, "A")), ("AgamP4_UNKN", (42389979, "A")), ("AgamP4_X", (24393108, "S")), ("AgamP4_Y_unplaced", (237045, "S"))]),
"gc_file":"/filepath/AgamP4_gc_file.pytor",
"mask_file": "/filepath/AgamP4_mask_file.pytor"
}
}
Masked file created with:
cnvpytor -root /filepath/AgamP4_mask_file.pytor -mask /filepath/genome.fa -make_mask_file
The bam file headers were previously edited alongside the chromosome names in the reference genome file to remove '()' characters. I have made sure they match so I assume this is not causing an issue.
Do you have P
bases in /filepath/genome.fa
file that you used to created the AgamP4_mask_file.pytor
if not, please try without the mask file.
Otherwise, will you able to share the AgamP4_mask_file.pytor
, AgamP4_gc_file.pytor
, AgamP4_mask_file.pytor
via email ? I can check the files.
Hello,
I simply decided to start again by removing the conda environment and reinstalling. I have successfully used the stats, partition and call parameters with -conf
, however, I have come up with another error.
-ls
gives the following:
Filename '/filepath/filename.pytor'
----------------------------------------------------------------------------------------
File created: 2023-03-07 16:38 using CNVpytor ver 1.2.1
Chromosomes with RD signal: backbone, transgene, AgamP4_2L, AgamP4_2R, AgamP4_3L, AgamP4_3R, AgamP4_X
Chromosomes with SNP signal:
Using reference genome: agamp4 [
Chromosomes with RD histograms [bin sizes]: transgene, AgamP4_2L, AgamP4_2R, AgamP4_3L, AgamP4_3R, AgamP4_X, backbone [100]
Chromosomes with SNP histograms [bin sizes]: []
Chromosome lengths: {'backbone': '12339', 'transgene': '8299', 'AgamP4_2L': '49364325', 'AgamP4_2R': '61545105', 'AgamP4_3L': '41963435', 'AgamP4_3R': '53200684', 'AgamP4_X': '24393108'}
However the cnv/ read depth is missing from the manhattan plot when I use:
cnvpytor -conf /filepath/ref_genome_conf.py -root /filepath/filename.pytor -view 100
set rd_use_mask
print calls
# set panels rd
set chrom backbone transgene AgamP4_2L AgamP4_2R AgamP4_3L AgamP4_3R AgamP4_X
manhattan
save /filepath/plotname.png
quit
I simply get an empty plot. Depending on other parameters that have been used, sometimes a message pops up explaining :
opt/miniconda3/envs/cnvpytor_env/lib/python3.7/site-packages/cnvpytor/viewer.py:1997: UserWarning: Attempting to set identical left == right == -0.0 results in singular transformations; automatically expanding. ax.set_xlim([-l 0.0, (l - 1) 1.0])
I have also tried :
cnvpytor -conf /filepath/ref_genome_conf.py \
-root /filepath/filename.pytor \
-plot manhattan 100 \
-chrom "backbone", "transgene", "AgamP4_2L", "AgamP4_2R", "AgamP4_3L", "AgamP4_3R", "AgamP4_X" \
-o /filepath/plotname.png
This only gave a plot with overlapping chromosome names...
Lastly, is it possible to merge regions ie find the average copy number of each chromosome? I am trying to compare the transgene with the reference. Thanks
You are still having empty Manhattan plot. Right? The reference genome section probably has an issue. See this section in ls
result.
Using reference genome: agamp4 [
If you have used mask file, it should be like agamp4 [ GC: yes, mask: yes ]
if, not it will be like agamp4 [ GC: yes, mask: no ]
For the warning,
opt/miniconda3/envs/cnvpytor_env/lib/python3.7/site-packages/cnvpytor/viewer.py:1997: UserWarning: Attempting to set identical left == right == -0.0 results in singular transformations; automatically expanding. ax.set_xlim([-l 0.0, (l - 1) 1.0])
This warning is due to the visualization library and not creating any issue for now, Please ignore it for now. We will try to address it in the coming releases.
Lastly, is it possible to merge regions ie find the average copy number of each chromosome? I am trying to compare the transgene with the reference.
The question is clear to me. As per my understanding, you are trying to find out if the contig called transgene
has any copy number variation or not. Am I right? If yes, you can find that by cnv calls or by looking at the Manhattan plot.
Odd how that happens despite seeing the following information when importing ref_genome_conf.py, rd, .bam, etc.:
cnvpytor.bam - INFO - Detected reference genome: agamp4
Yes to your question. My workaround was simply adding the ref_genome_conf.py
code to the genome.py script. I now have Using reference genome: agamp4 [ GC: yes, mask: yes ]
As per my understanding, you are trying to find out if the contig called transgene has any copy number variation or not. Am I right? If yes, you can find that by cnv calls or by looking at the Manhattan plot.
Yes to your question, however, I am still getting an empty manhattan plot when using:
cnvpytor -conf /filepath/ref_genome_conf.py -root /filepath/filename.pytor -view 100
set style bmh
set rd_use_mask
print calls
set panels rd
set chrom backbone transgene AgamP4_2L AgamP4_2R AgamP4_3L AgamP4_3R AgamP4_X
manhattan
Result:
The mask file has no P bases and the set rd_use_mask
is used.
CNVpytor
Done:
-ls
provides the following:Issues:
Have tried:
cnvpytor -root /filepath/filename.pytor -stat 100
cnvpytor -root /filepath/filename.pytor -his 100
cnvpytor -root /filepath/filename.pytor -partition 100
cnvpytor -root /filepath/filename.pytor -call 100
cnvpytor -root /filepath/filename.pytor -view 100
-v debug
doesn't show or highlight any errors...print calls
in view doesn't show any calls..Many thanks,