fgvieira / ngsLD

Calculation of pairwise Linkage Disequilibrium (LD) under a probabilistic framework
GNU General Public License v2.0

ERROR: [read_dist] invalid distance between adjacent sites! #23

Closed avnitkr closed 3 years ago

avnitkr commented 3 years ago

Greetings!

I am using ngsLD to visualise LD blocks using whole genome sequences of an advanced recombinant population.

I am using the following command:

NGSLD=/home/uqakaur3/ngsLD/

# run ngsLD
$NGSLD/ngsLD --geno /30days/uqakaur3/recombinantpop/ngsld/ngsld_chreleven.beagle.gz --probs --n_ind 131 --n_sites 45946 \
    --pos /30days/uqakaur3/recombinantpop/ngsld/ngsld_sites_chreleven.mafs \
    --n_threads 15 --max_kb_dist 0 --max_snp_dist 0 --out ngsLD_chr11.ld --extend_out

I received an error:

Input Arguments:
    geno: /30days/uqakaur3/recombinantpop/ngsld/ngsld_chreleven.beagle.gz
    probs: true
    log_scale: false
    n_ind: 131
    n_sites: 45946
    pos: /30days/uqakaur3/recombinantpop/ngsld/ngsld_sites_chreleven.mafs (WITHOUT header)
    max_kb_dist (kb): 100
    max_snp_dist: 0
    min_maf: 0.001000
    ignore_miss_data: false
    call_geno: false
    N_thresh: 0.000000
    call_thresh: 0.000000
    rnd_sample: 1.000000
    seed: 1620279247
    extend_out: true
    out: ngsLD_chr11.ld (WITHOUT header)
    n_threads: 15
    verbose: 1
    version: 1.1.1 (Apr 19 2021 @ 14:29:16)

==> GZIP input file (not BINARY)

Reading data from file...
Header found! Skipping line...
==> Calculating MAF for all sites...
==> Getting sites coordinates

===== ERROR: [read_dist] invalid distance between adjacent sites!

    : Numerical result out of range

Any suggestion would be helpful.

avnitkr commented 3 years ago

Correction: the script actually uses --max_kb_dist 100.

fgvieira commented 3 years ago

That sounds like you have adjacent sites that either share the same coordinates or are not sorted. Can you check whether all positions in your pos file are sequential?
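One way to run that check is sketched below. It assumes the pos file is tab-separated with the chromosome or contig name in column 1 and the position in column 2, and has no header; `check_pos_sorted` is a hypothetical helper name, not part of ngsLD.

```shell
# Hypothetical helper: report adjacent sites on the same contig whose
# position is equal to or smaller than the previous one.
# Assumes: tab-separated, column 1 = contig, column 2 = position, no header.
check_pos_sorted() {
    awk -F'\t' '
        $1 == prev_chr && $2 + 0 <= prev_pos + 0 {
            printf "line %d: %s:%s follows %s:%s\n", NR, $1, $2, prev_chr, prev_pos
            bad = 1
        }
        { prev_chr = $1; prev_pos = $2 }
        END { exit bad + 0 }   # non-zero exit status if any problem was found
    ' "$1"
}
```

Calling `check_pos_sorted ngsld_sites_chreleven.mafs` would print one line per offending pair and return a non-zero status if any are found.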

avnitkr commented 3 years ago

Thanks Filipe. My pos file was not sequential. I corrected it and was able to generate an .ld file with correlation values. I am now trying to use your script LD_blocks.sh to plot LD blocks.

cat ngsLD_test.ld | bash /home/uqakaur3/ngsLD/scripts/LD_blocks.sh SPDCN1KCT_10481 10000 80000

Here, SPDCN1KCT_10481 represents one contig. My analysis is based on a region file (rf) containing a list of contigs. Is there a way to plot LD blocks for every contig in that list, given that my .ld file contains correlations for all of them?

Thanks. Cheers, Avneet

avnitkr commented 3 years ago

Also, when I run the above command,

cat ngsLD_test.ld | bash /home/uqakaur3/ngsLD/scripts/LD_blocks.sh SPDCN1KCT_10481 10000 80000

I receive the following error:

63 SNPs found!
/home/uqakaur3/ngsLD/scripts/LD_blocks.sh: line 24: R: command not found

I am working on a computer cluster and tried a few options to troubleshoot it (Google suggestions). I am not sure why the HPC cannot find the R command in the script. Any suggestion would be helpful.

Cheers, Avneet

fgvieira commented 3 years ago

Good to hear that it works now, but keep in mind that the order of the sites in the pos file needs to be the same as in the beagle file!
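A rough way to verify that the two files list sites in the same order is sketched below. It rests on assumptions that may not hold for your data: the beagle file is gzipped with one header line and a first column of markers written as `contig_position` (as ANGSD typically produces), and the pos file is tab-separated with contig and position in columns 1 and 2 and no header. `same_site_order` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if the beagle markers and the pos file
# list exactly the same sites in exactly the same order.
# Assumes beagle column 1 looks like "<contig>_<position>".
same_site_order() {
    beagle_gz=$1
    pos_file=$2
    tmp_a=$(mktemp)
    tmp_b=$(mktemp)
    # Marker column of the beagle file, skipping its header line
    gzip -dc "$beagle_gz" | tail -n +2 | cut -f1 > "$tmp_a"
    # Rebuild the same marker format from the pos file
    awk -F'\t' '{ print $1 "_" $2 }' "$pos_file" > "$tmp_b"
    cmp -s "$tmp_a" "$tmp_b"
    status=$?
    rm -f "$tmp_a" "$tmp_b"
    return $status
}
```

For example, `same_site_order ngsld_chreleven.beagle.gz ngsld_sites_chreleven.mafs && echo "same order"`.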

Then you'd have to run the command for each contig. The easiest would prob be to make a for loop in bash.
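Such a loop might look like the sketch below. It assumes a plain-text file with one contig name per line (called `contigs.txt` here, a hypothetical name) and reuses the same two numeric arguments for every contig, which may not suit contigs of different lengths. `plot_ld_blocks_for_contigs` is an illustrative wrapper, not part of ngsLD.

```shell
# Hypothetical wrapper: run an LD_blocks.sh-style script once per contig.
# Arguments: <ld file> <contig list> <script path> <start> <end>
plot_ld_blocks_for_contigs() {
    ld_file=$1
    contig_list=$2
    script=$3
    start=$4
    end=$5
    while IFS= read -r contig; do
        # Skip empty lines in the contig list
        [ -n "$contig" ] || continue
        # Feed the .ld file on stdin, as "cat file | bash script" does
        bash "$script" "$contig" "$start" "$end" < "$ld_file"
    done < "$contig_list"
}
```

With the paths from this thread, the call would be `plot_ld_blocks_for_contigs ngsLD_test.ld contigs.txt /home/uqakaur3/ngsLD/scripts/LD_blocks.sh 10000 80000`.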

Do you have R installed? Maybe you need to load some modules?
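On many clusters R only becomes visible on the PATH after a module is loaded, and that has to happen in the same shell or job script that runs LD_blocks.sh. The sketch below checks for a command and tries to load a module if it is missing. The module name is site-specific (`module avail` lists what your cluster offers), and `ensure_cmd` is a hypothetical helper.

```shell
# Hypothetical helper: make sure a command is on the PATH, trying to load
# an environment module if it is not.
# Arguments: <command name> <module name (site-specific)>
ensure_cmd() {
    cmd=$1
    module_name=$2
    command -v "$cmd" >/dev/null 2>&1 && return 0
    # "module" only exists on systems using Environment Modules or Lmod
    if command -v module >/dev/null 2>&1; then
        module load "$module_name"
    fi
    # Final status: 0 if the command is now available, non-zero otherwise
    command -v "$cmd" >/dev/null 2>&1
}
```

For example, `ensure_cmd R R || echo "R still not found"` placed before the LD_blocks.sh call, assuming the module is simply named `R` on your system.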

avnitkr commented 3 years ago

Thanks Filipe. I will keep that in mind. Yes, R is installed; it's probably an HPC issue. Thanks for your help.

Cheers, Avneet