vplagnol / ExomeDepth

ExomeDepth R package for the detection of copy number variants in exomes and gene panels using high throughput DNA sequencing data.

Error in creating vignette #52

Open skgs1970 opened 1 year ago

skgs1970 commented 1 year ago

I downloaded ExomeDepth from GitHub and got all the dependencies sorted out. Now I am having an issue with the processing of the ExomeDepth vignette.

It gives me the following errors. Can this be sorted out?

These are the last lines of output when I run make build:

###############################################
R CMD build --resave-data working/ExomeDepth

SUMMARY: processing the following file failed:
‘ExomeDepth-vignette.Rnw’

Error: Vignette re-building failed.
Execution halted
make: *** [Makefile:7: build] Error 1

I really appreciate any help you can provide.

vplagnol commented 1 year ago

Can you try now? It works for me nicely but I have been doing some work on the package so it is plausible that you picked up the package at the wrong time (sorry for messing with the master branch).

skgs1970 commented 1 year ago

Dear Dr Plagnol, thanks for your reply. It works well now. The problems were in compiling ExomeDepth. The R packages devtools and pkgdown needed some Linux system libraries, and these were causing problems. I compiled it on the latest Ubuntu 22.04.1 LTS. I also had to change the Renviron file from using texi2dvi to R_TEXI2DVICMD=emulation. It compiled properly and seems to be working now.
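For anyone hitting the same vignette failure, the Renviron change described above might look roughly like the sketch below. It assumes the per-user Renviron file is the one being edited (the comment does not say which Renviron file was changed) and falls back to `~/.Renviron` when `R_ENVIRON_USER` is not set.

```shell
# Assumption: the per-user Renviron file is the right place for this setting.
# The thread does not say whether the user-level or site-wide file was edited.
renviron="${R_ENVIRON_USER:-$HOME/.Renviron}"

# Make sure the file exists before searching it.
touch "$renviron"

# Append the setting only if no R_TEXI2DVICMD line is present yet,
# so running this twice does not duplicate the entry.
grep -q '^R_TEXI2DVICMD=' "$renviron" || echo 'R_TEXI2DVICMD=emulation' >> "$renviron"
```

R reads this file at startup, so a new R session (or a fresh `R CMD build`) is needed before the setting takes effect.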

I have generated the Exons38 file from GCA_000001405.15_GRCh38_full_analysis_set.refseq_annotation.gff.gz. If it is useful to you and the ExomeDepth user community, I can send the prepared Exons38 file along with the bash scripts used to create it. It seems to be working fine, and we are now checking how it compares with the latest GATK CNV scripts. Thanks again for your help.

Regards skgs1970

vplagnol commented 1 year ago

Thanks. First of all, the package is now back on CRAN, so I can think about adding other bits. Do you want to send me a pull request with the relevant file? My concern is that the hg19 file is already almost 2 MB, and adding another one of these may make the package rather heavy. I suspect there is a better way to make these annotation files available, perhaps through an address on GitHub but not as part of the package itself.

skgs1970 commented 1 year ago

Dear Dr Plagnol, the package is working well now, and I have also updated it to the latest version from CRAN. Your suggestion of making the files available separately is logical. I have been using your R script (with Exons38) in two parts:

1) Part one builds a reference set from a set of BAM files and stores it as a CSV.
2) Part two calls the CNVs in the samples.

You had said that it generates about 170-200 CNVs. I have been getting only about 100 CNVs with Exons38.

My current work is on trios and quads that have one or two affected individuals. We made two sets of reference files, one with 36 unrelated samples from the same sequencing experiments and the other with 72 samples. These are parents of affected children and non-consanguineous couples. Our samples are all from India. I have been checking which reference samples are used, from the output of mychoice. It selects anywhere between 5 and 14 samples for the mychoice reference set out of the 36 or 72 available reference samples. A few samples are selected about 70% of the time when calling CNVs. Do you think this may have something to do with the lower number of CNVs?
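One way to quantify how often each reference sample is selected is a simple tally, sketched below. It assumes the chosen references have first been exported from the R session into a tab-separated file with one line per (test sample, chosen reference) pair. ExomeDepth does not write such a file itself, so both the file layout and the helper name `tally_reference_usage` are hypothetical.

```shell
# Hypothetical helper: reads tab-separated "test_sample<TAB>reference_sample"
# lines on stdin and prints how many times each reference sample was chosen,
# most frequently chosen first.
tally_reference_usage() {
  cut -f2 | sort | uniq -c | sort -k1,1nr -k2,2
}

# Usage (the file name is a placeholder):
# tally_reference_usage < chosen_references.tsv
```

Dividing each count by the number of test samples gives the fraction of calls in which that reference sample was used.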

As for the Exons38 file, I have generated it from the RefSeq MANE set, and it contains 19062 genes. I can send the relevant file and also the bash code I used to generate it. You can send me the pull request and I will send both files. I tried initially with GCA_000001405.15_GRCh38_full_analysis_set.refseq_annotation.gff.gz, but it contains lots of putative genes, LOC entries, MIRs and so on, so I stuck to the RefSeq MANE set. The RefSeq MANE set is updated every few months.
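The poster's own bash scripts are not included in the thread. As a rough illustration only, extracting exon coordinates from a GFF3 annotation could be sketched as below. It assumes standard tab-separated GFF3 with the feature type in column 3 and a `gene=` key in the attribute column, as RefSeq-style annotations commonly use; the helper name `gff_exons_to_table` and the four-column output layout are assumptions, not the format the poster actually produced.

```shell
# Hypothetical helper: reads GFF3 on stdin and prints one tab-separated line
# per exon feature: chromosome, start, end, gene name.
# Coordinates are passed through unchanged (GFF3 is 1-based, inclusive).
gff_exons_to_table() {
  awk -F'\t' 'BEGIN { OFS = "\t" }
    /^#/ { next }                       # skip header and comment lines
    $3 == "exon" {
      gene = "NA"                       # fallback when no gene= attribute exists
      n = split($9, attrs, ";")
      for (i = 1; i <= n; i++)
        if (attrs[i] ~ /^gene=/) gene = substr(attrs[i], 6)
      print $1, $4, $5, gene
    }'
}

# Usage (file names are placeholders):
# gzip -dc annotation.gff.gz | gff_exons_to_table | sort -u > exons38.tsv
```

The `sort -u` step in the usage line drops exact duplicates, which arise when several transcripts share the same exon. Any further filtering (for example, removing LOC or MIR entries) would have to be added separately.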

Thanks for your time.