weiyuchung / CONY

CONY Program

Problems with CONY dependencies #1

Open MikiSchikora opened 3 years ago

MikiSchikora commented 3 years ago

Good evening,

Thanks for developing this tool. I am struggling to use it because the latest versions of its dependencies (R, IRanges, snow and exomeCopy) are not compatible with CONY. Could you tell me which versions of these were used in testing?

Thanks.

Miquel Àngel Schikora

MikiSchikora commented 3 years ago

Good evening,

After trying several versions, I was able to partially run the software by following the manual section "Detect absolute copy number for single sample analysis". I am running it on budding yeast samples. Every function writes the expected output until UsedRD, which writes three empty files, so I cannot continue with the later steps. There are no errors or warnings to trace.

I am using the following versions of the software:

I should also note that I ran bwa mem to generate the BAM file, and then the following command:

samtools mpileup -f chromosome_1.fa --min-MQ 30 --min-BQ 30 -a chromosome_1.sorted.bam | cut -f2,4 > chromosome_1.mpileup.txt

This is the mpileup that I use in the CONY pipeline.
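When a later step silently writes empty files, one quick thing to rule out is a problem in the input itself. The following is a minimal sketch, assuming the file produced by the command above holds two whitespace-separated columns per line (position, depth), as `cut -f2,4` on samtools mpileup output would give. The function name `summarize_depth` is hypothetical and is not part of CONY.

```python
from typing import Dict, Iterable


def summarize_depth(lines: Iterable[str]) -> Dict[str, float]:
    """Summarize a two-column (position, depth) table.

    Assumes each non-blank line looks like "<position><TAB><depth>".
    """
    n_positions = 0
    n_zero_depth = 0
    total_depth = 0
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        depth = int(fields[1])  # second column is the read depth
        n_positions += 1
        total_depth += depth
        if depth == 0:
            n_zero_depth += 1
    # Guard against an empty input file
    mean_depth = total_depth / n_positions if n_positions else 0.0
    return {
        "n_positions": n_positions,
        "n_zero_depth": n_zero_depth,
        "mean_depth": mean_depth,
    }


# Small in-memory example; for a real file, pass an open file handle instead.
example = ["1\t12", "2\t0", "3\t9"]
print(summarize_depth(example))
```

A file with zero positions, or one where almost every position has depth 0 after the quality filters, would point to the input rather than to the dependency versions.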

Do you have any suggestion on how to solve this? Maybe I am not using the correct dependencies.

Thanks,

Miquel Àngel Schikora

piyalkarum commented 1 year ago

I have the same issue with RangedData function of IRanges. Were you able to solve this issue?

MikiSchikora commented 1 year ago

Hi,

Yes, in the end I debugged the code myself and got it to run. I actually wrote perSVade, a pipeline that calls various types of variants directly from short reads or sorted BAM files. For CNV calling it integrates the outputs of CONY, AneuFinder and HMMcopy (see the 'call_CNVs' module of perSVade). You may want to try it. By the way, I found that while AneuFinder and HMMcopy generally work well, CONY had issues with certain window lengths or genomes.

Within the scripts folder there is some code that may be relevant to you. First, 'call_CNVs' is a Python wrapper that runs CNV calling from the BAM file. Second, 'run_CONY.R' is the script that does the running, based on 'CONY_package_debugged.R', which is the debugged source code.

If you have trouble running perSVade, let's discuss it through an issue there.

I hope this helps,

Miquel Àngel Schikora

piyalkarum commented 1 year ago

Hi Miquel,

Thank you very much for the quick reply. The authors probably gave up on this. :D I tried replacing RangedData with GRanges from the GenomicRanges package, since it is the replacement for RangedData. However, the two are not interchangeable and it did not work. I will check your scripts. Do your scripts work on long-read sequences as well?

Best, Piyal

MikiSchikora commented 1 year ago

Hi,

perSVade is designed to work with short reads, and the SV and small-variant calling features are certainly tailored to them. However, you could generate a sorted BAM from long reads and pass it to call_CNVs, which should work in theory. call_CNVs uses mosdepth to calculate coverage from the BAM, which should also work with long reads, although I am not sure how fast it will be.
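To make the idea of window-based coverage concrete, here is a minimal sketch that bins per-base depth into fixed-size windows and reports the mean depth per window. It is only an illustration: it is not how mosdepth, call_CNVs or CONY are implemented, and the function name `windowed_mean_depth` and its parameters are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def windowed_mean_depth(
    records: Iterable[Tuple[int, int]], window_size: int
) -> List[Tuple[int, int, float]]:
    """Return (start, end, mean_depth) for each fixed-size window.

    `records` holds (position, depth) pairs with 1-based positions.
    Window coordinates are 1-based and inclusive. The mean is taken over
    the positions actually present in the input.
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    sums: Dict[int, int] = {}
    counts: Dict[int, int] = {}
    for pos, depth in records:
        idx = (pos - 1) // window_size  # 0-based index of the window
        sums[idx] = sums.get(idx, 0) + depth
        counts[idx] = counts.get(idx, 0) + 1
    windows = []
    for idx in sorted(sums):
        start = idx * window_size + 1
        end = (idx + 1) * window_size
        windows.append((start, end, sums[idx] / counts[idx]))
    return windows


# Five positions binned into windows of two bases each.
records = [(1, 10), (2, 20), (3, 30), (4, 40), (5, 50)]
for window in windowed_mean_depth(records, window_size=2):
    print(window)
```

Because only the per-window depth matters in this kind of summary, read length does not enter the calculation directly. That fits the expectation that a coverage-based approach should work on a sorted BAM from long reads.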

I hope this helps,

Miquel Àngel Schikora

piyalkarum commented 1 year ago

Hi,

Thanks for the explanation! Currently I'm sorting through programs and scripts to call CNVs from long reads, and CONY says it can handle long reads because it is window-based. I will try your scripts on a couple of samples. Please let me know if you have other suggestions.

Best, Piyal