Closed mike2vandy closed 6 years ago
There was a bug whereby only the coordinate was used to decide whether we had skipped a line. This was fine as long as the coordinate wasn't the same when switching from one chromosome to the next. I have since included the index of the chromosome in the comparison. Thank you for your report!
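To illustrate the kind of comparison described above, here is a minimal Python sketch. It is not the glactools source; the record layout, the chromosome names `chrA` and `chrB`, and the helper names are all assumptions made up for the example. It only shows why comparing coordinates alone can mistake two records on different chromosomes for the same site, and how adding the chromosome index avoids that.

```python
# Hypothetical records: (chromosome name, coordinate). Not glactools' real data structures.
# Assumed chromosome order, e.g. as it might be listed in a file header.
chrom_index = {"chrA": 0, "chrB": 1}

def same_site_coordinate_only(a, b):
    """Buggy variant: looks at the coordinate and nothing else."""
    return a[1] == b[1]

def site_key(record):
    """Sort key that orders records by chromosome index first, then coordinate."""
    chrom, coord = record
    return (chrom_index[chrom], coord)

def same_site(a, b):
    """Fixed variant: both chromosome index and coordinate must match."""
    return site_key(a) == site_key(b)

rec1 = ("chrA", 740)  # file 1 is still on the first chromosome
rec2 = ("chrB", 740)  # file 2 has already moved on to the next one

print(same_site_coordinate_only(rec1, rec2))  # True: wrongly treated as the same site
print(same_site(rec1, rec2))                  # False: different chromosomes
print(site_key(rec1) < site_key(rec2))        # True: file 1 is behind and should advance
```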
Intersect requires a site to be defined in every input file. Union does not. Say file 1 has data for the following coordinates:
103 104 105
and for the same chromosome, file 2 has:
103 104 106
intersection will contain:
103 104
union will contain:
103 104 105 106
However, for the site at position 105, since it is only defined in file 1, the allele count columns for file 2 will just be filled with 0,0:0 (and likewise for file 1 at position 106). This is done to account for missing data.
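The same example can be sketched in a few lines of Python. This is only an illustration of the set logic, not how glactools is implemented; the dictionaries, the function names, and the allele count strings other than the 0,0:0 placeholder are made up for the example.

```python
# Placeholder written when a file has no data for a site.
MISSING = "0,0:0"

# Hypothetical per-file data for one chromosome: coordinate -> allele count field.
file1 = {103: "2,0:0", 104: "1,1:0", 105: "0,2:0"}
file2 = {103: "2,0:0", 104: "2,0:0", 106: "1,1:0"}

def intersect(a, b):
    """Keep only the sites that are defined in both files."""
    return {pos: (a[pos], b[pos]) for pos in sorted(a.keys() & b.keys())}

def union(a, b):
    """Keep every site defined in either file, filling gaps with MISSING."""
    return {pos: (a.get(pos, MISSING), b.get(pos, MISSING))
            for pos in sorted(a.keys() | b.keys())}

print(list(intersect(file1, file2)))  # [103, 104]
print(list(union(file1, file2)))      # [103, 104, 105, 106]
print(union(file1, file2)[105])       # ('0,2:0', '0,0:0'): file 2 has no data at 105
```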
I re-downloaded and reran union without a problem. Thanks for the help.
No, thank you for reporting this.
So, I just downloaded glactools and I'm working through the pipeline to convert bam to plink (bam2acf, union, acf2bplink).
At union I'm getting this error. Why would this error be thrown?
glactools union ERR484729.acf ERR490277.acf > test2.acf
sanityCheck()1 Chromosomes differ between FLZR01000063.1 740 A,N 0,0:0 0,0:0 1,0:0 and FLZR01000063.1 740 A,N 0,0:0 0,0:0 1,0:0
Also, I know it's a FAQ, but I still don't understand the difference between a defined site (intersect) and an undefined site (union), because I can get intersect to work without issues.
Thanks, Mike