Closed argrs-mtjs closed 9 months ago
Would you be willing to share the dataset, since it's only a few SNPs, so we can inquire into the error? The first does not seem related to docker.
It is the out.ff.vcf that you provide in https://github.com/Gabaldonlab/jloh/tree/master/test_data. I was just testing if everything was ok before proceeding to my dataset.
I suspected that the first error could be unrelated to docker...
We'll look into that. Perhaps try with another VCF you have lying around and see if the same thing happens.
I tested with one of my VCFs and got the same error. I will wait for further insights from you.
Could you tell me how many variants are in your VCF, and if you have any chromosome with no variants?
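For anyone wanting to answer both questions quickly, here is a minimal sketch in plain Python. It assumes an uncompressed VCF whose header declares chromosomes with `##contig=<ID=...>` lines; the function name `variants_per_contig` is made up for illustration and is not part of jloh.

```python
from collections import Counter


def variants_per_contig(vcf_lines):
    """Count VCF records per chromosome and list declared contigs with none.

    vcf_lines: any iterable of text lines (an open file works).
    Returns (counts, empty), where counts is a Counter keyed by CHROM and
    empty is the list of header-declared contigs that have zero records.
    """
    declared = []       # contig IDs from the ##contig header lines, in order
    counts = Counter()  # number of variant records seen per CHROM
    for line in vcf_lines:
        line = line.rstrip("\n")
        if line.startswith("##contig=<"):
            # e.g. ##contig=<ID=chr1,length=1000>
            for field in line[len("##contig=<"):].rstrip(">").split(","):
                if field.startswith("ID="):
                    declared.append(field[3:])
        elif line and not line.startswith("#"):
            # data line: the first tab-separated column is CHROM
            counts[line.split("\t", 1)[0]] += 1
    empty = [contig for contig in declared if counts[contig] == 0]
    return counts, empty


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python count_variants.py my.vcf
    if len(sys.argv) > 1:
        with open(sys.argv[1]) as handle:
            counts, empty = variants_per_contig(handle)
        print("total variants:", sum(counts.values()))
        for contig, n in counts.items():
            print(contig, n)
        print("contigs without variants:", empty or "none")
```

The total and the list of empty contigs are the two numbers asked for above. For a gzipped VCF, the file would need to be opened with `gzip.open(path, "rt")` instead.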
Sure! I have 87129 SNPs in my VCF and all the chromosomes have variants.
Besides giving the error I reported above, jloh seems to be able to read the VCF almost as it should. I noticed a small discrepancy in the number of het SNPs (it should be 81905), whereas the number of homo SNPs is correct.
[Thu Jan 25 15:35:44 2024] Reading SNPs
[Thu Jan 25 15:35:44 2024] found 81840 het SNPs and 5224 homo SNPs
[Thu Jan 25 15:35:44 2024] Reading chrom lengths from VCF header
[Thu Jan 25 15:35:44 2024] Read 9 chromosome names and their lengths
[Thu Jan 25 15:35:44 2024] Calculating heterozygous SNP densities
The discrepancy could be due to the default values of --min-af and --max-af. As for the rest, I'm going to keep investigating.
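To see how an allele-frequency window can shave a few dozen SNPs off the heterozygous count, here is a rough sketch. It is not how jloh counts SNPs: it assumes a single-sample VCF, reads the frequency from an `AF=` entry in the INFO column, and the 0.2 and 0.8 thresholds in the usage note are placeholders, not jloh's actual defaults (those are not stated in this thread). The function name `count_het_snps` is hypothetical.

```python
def count_het_snps(vcf_lines, min_af=None, max_af=None):
    """Count heterozygous SNPs, optionally keeping only min_af <= AF <= max_af.

    Assumptions (not taken from jloh): one sample column, genotype in the
    GT field, allele frequency in INFO as AF=<float>. Records without an AF
    entry are never filtered out.
    """
    het = 0
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        cols = line.rstrip("\n").split("\t")
        ref, alt, info, fmt, sample = cols[3], cols[4], cols[7], cols[8], cols[9]
        # keep SNPs only: REF and every ALT allele must be a single base
        if len(ref) != 1 or any(len(a) != 1 for a in alt.split(",")):
            continue
        genotype = sample.split(":")[fmt.split(":").index("GT")]
        alleles = genotype.replace("|", "/").split("/")
        if "." in alleles or len(set(alleles)) < 2:
            continue  # missing or homozygous genotype
        af = None
        for item in info.split(";"):
            if item.startswith("AF="):
                af = float(item[3:].split(",")[0])  # first ALT allele only
        if af is not None:
            if min_af is not None and af < min_af:
                continue
            if max_af is not None and af > max_af:
                continue
        het += 1
    return het
```

Calling `count_het_snps(open("my.vcf"))` and then `count_het_snps(open("my.vcf"), min_af=0.2, max_af=0.8)` (placeholder thresholds) and comparing the two numbers would show whether a frequency filter alone could explain a gap like 81905 versus 81840.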
The above commit should fix the issue with reshape2, which was caused by something on their side.
Could you give me your exact command to launch the test dataset? Might help.
Thank you for the commit. It fixed the first problem, the one with "jloh stats". "jloh plot" now gives an error about the package "hash":
import pandas as pd
INFO: Pandarallel will run on 12 workers.
INFO: Pandarallel will use standard multiprocessing data transfer (pipe) to transfer data between the main process and workers.
[Tue Jan 30 15:18:14 2024] Reading input information
[Tue Jan 30 15:18:14 2024] Quantizing heterozygosity in windows of 10000 bp
Parsing rows: 100.0%
[Tue Jan 30 15:18:14 2024] Sorting by genome coordinate
[Tue Jan 30 15:18:14 2024] Quantizing intervals in windows of 10000 bp
Parsing rows: 100.0%
[Tue Jan 30 15:18:14 2024] Sorting by genome coordinate
[Tue Jan 30 15:18:14 2024] Writing table to output
[Tue Jan 30 15:18:14 2024] Plotting
Plot command that was run:
Rscript /root/src/jloh/src/scripts/loh-bin-plots_one-ref.Rscript jloh_out/plot.LOH_rate.tsv jloh_out/plots by_chromosome /input/jloh.LOH_blocks.tsv 0.35,2000,750,250 REF,ALT \#F7C35C,\#EF6F6C,\#64B6AC,\#ffffff no plot max
Error in library(hash) : there is no package called ‘hash’
Execution halted
The exact command for the plot is:
docker run -v $PWD:/input -t -i --rm cgenomics/jloh jloh plot --one-ref --loh /input/jloh.LOH_blocks.tsv --het /input/jloh.exp.het_blocks.bed --contrast max
Hi @argrs-mtjs, we have issued a new release that fixes the problem you brought up: https://github.com/Gabaldonlab/jloh/commit/cd40daed6e5e7308e67771b50b0af9ac3939217e.
You will have to rebuild the docker image.
Thanks for pointing it out, we hope that now it works. We have tested it and it works fine for us.
Hi! I'm trying to run jloh using docker, but I am running into some errors when following the instructions (https://jloh.readthedocs.io/en/latest/usage/run_test_data.html) for the test dataset provided.
For the "jloh stats" command, I get:
For the "jloh extract", everything runs as expected.
And for the "jloh plot", I get:
Can you please advise? Thanks in advance