Issue with ssimp_chunks.sh

Attachment: ssimp_chunks.log

Dear Zoltan,

I ran into an error while running ssimp_chunks.sh: "Error in paste(X1, X2, sep = ":") : object 'X2' not found" (line 80 in the attached log file). It seems to be caused by a failure to read the reference file containing chr/pos (see lines 41-53). The file downloaded from "https://drive.switch.ch/index.php/s/uOyjAtdvYjxxwZd/download" appears to be "database.of.builds.1kg.uk10k.hrc.2018.01.18.bin". Although the script saves it as a gzipped file ("reference_panels/dbsnp_hg20_chr_pos_sorted.txt.gz"), it is not actually gzip-compressed and cannot be read by read_tsv. As a result, the object "tkg" is never created, which leads to the error above.
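One quick way to confirm this diagnosis is to test the downloaded file before R tries to read it. The sketch below is only a minimal check, assuming a POSIX shell with `gzip` installed; the helper name `is_gzip` is hypothetical and not part of ssimp_chunks.sh.

```shell
#!/bin/sh
# is_gzip: hypothetical helper, succeeds only if the file is a valid gzip archive.
is_gzip() {
    # gzip -t tests the archive's integrity without writing any output
    gzip -t "$1" 2>/dev/null
}

# Path taken from the script; adjust if your working directory differs.
ref="reference_panels/dbsnp_hg20_chr_pos_sorted.txt.gz"

if is_gzip "$ref"; then
    echo "$ref is a valid gzip file"
else
    echo "$ref is missing or not a gzip file (wrong download?)" >&2
fi
```

If the check fails on a freshly downloaded file, the problem lies with the download link rather than with read_tsv.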

Would it be possible to provide the correct file "dbsnp_hg20_chr_pos_sorted.txt.gz"?

Instead of using the above reference file, I guess an alternative is to use the gz files in ~/reference_panels/1000genomes/. However, this requires modifying the last part of ssimp_chunks.sh as below:

    for (CHRM in 1:22){
      ## build the per-chromosome file name (a plain string does not expand {CHRM})
      file <- paste0("~/reference_panels/1000genomes/ALL.chr", CHRM, ".phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz")

      cat("Start: loading large file\n")
      tkg <- read_tsv(file, col_names = TRUE) ## this can take some time
      cat("Finished: loading large file\n")

      ## columns
      ## X1 = chr
      ## X2 = pos - hg20

      ## remove MT and PAR and Y and X
      tkg <- filter(tkg, X1 %in% 1:22)

      ## create file with ssimp chunks
      ## ----------------------------------
      sessionInfo()

      impute.range <- uk10k.chunks.from.to(ref.file=tkg, nbr.chunks = nbr.chunks) ## returns "chr:pos.start-chr:pos.end"

      print.chunks(ssimp.args = ssimp.args, nbr.chunks = nbr.chunks, ref.file = tkg, out.name=out.name)
      cat("cat2\n")
    }
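As a possible stopgap, a chr/pos table could also be derived from the VCF files outside R, so that only two columns are loaded instead of the full genotype data. The following is only a sketch, assuming the 1000 Genomes file names given above and a shell with `gzip` and `awk`. The helper name `vcf_chr_pos`, the output file name, and the two-column, tab-separated, header-less layout are assumptions, not something the script documents.

```shell
#!/bin/sh
# vcf_chr_pos: hypothetical helper; prints chr and pos (columns 1 and 2)
# of a gzipped VCF, skipping every header line that starts with '#'.
vcf_chr_pos() {
    gzip -dc "$1" | awk -F'\t' '!/^#/ { print $1 "\t" $2 }'
}

panel_dir="$HOME/reference_panels/1000genomes"
out="chr_pos_from_vcf.txt.gz"   # hypothetical output name

chrm=1
while [ "$chrm" -le 22 ]; do
    vcf="$panel_dir/ALL.chr${chrm}.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz"
    # Silently skip chromosomes whose VCF is not present.
    if [ -f "$vcf" ]; then
        vcf_chr_pos "$vcf"
    fi
    chrm=$((chrm + 1))
done | gzip -c > "$out"
```

Note that the positions in these VCFs follow the build of the 1000 Genomes release, which may differ from the build of the original reference file.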

Regards, patrick

zkutalik / ssimp_software