ANGSD / angsd

Program for analysing NGS data.

-doSaf produces unsorted positions and mixes up chromosomes #258

Closed martabe closed 4 years ago

martabe commented 5 years ago

Hi everybody,

I've been trying to calculate Fst between two populations, but I got stuck at the first step: obtaining the saf files.

I start with a vcf file (coming from GATK and a subsequent filtering through bcftools), and I use ANGSD -doSaf. This step seems to go fine, but later on realSFS complains that my file is not sorted. The minimal example I could propose is actually not minimal, because if I subset my vcf into a smaller region I don't get the error any longer. I can reproduce the error on single sample and multi-sample vcfs though.

Here are my commands:

angsd -vcf-gl myvcfOnesample.vcf -doSaf 1 -out myvcf -P 4 -fai mygenome.fasta.fai -anc mygenome.fasta -nInd 1 -doMajorMinor 1 -fold 1
realSFS check myvcf.saf.idx
realSFS print myvcf.saf.idx > myvcf.saf.idx.table

angsd runs with no errors in STDERR. realSFS check reports a list of:

problems with unsorted saf file chromoname: 'PeexChr4' pos[7165]:10881 vs posd[7165-1]:159957354

When I check myvcf.saf.idx.table I see that the positions on PeexChr4 (and later chromosomes) are indeed not sorted correctly, e.g.:

PeexChr4        159957355
PeexChr4        10882

But actually, if I check in the vcf file, PeexChr4 has no position 159957355, but rather this is the last position of PeexChr3, while 10882 is the first position for PeexChr4.
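Breaks like this one can also be located directly in the printed table, independently of realSFS check. The following is a minimal sketch, assuming whitespace-separated output with the chromosome in column 1 and the position in column 2; the function name find_unsorted is hypothetical and not part of ANGSD.

```shell
# Hypothetical helper: report rows where the position does not increase
# within the same chromosome (column 1 = chromosome, column 2 = position).
# $1: table written by `realSFS print`.
find_unsorted() {
    awk '$1 == chr && $2 + 0 <= pos + 0 {
             printf "%s\tline %d\t%s after %s\n", $1, NR, $2, pos
         }
         { chr = $1; pos = $2 }' "$1"
}

# Example usage (file name taken from the commands above):
# find_unsorted myvcf.saf.idx.table
```

An empty result means every chromosome block is in increasing order. A reported row marks the first site that was written under the wrong chromosome name or out of order.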

I cannot figure out if I'm doing something wrong, and how to fix it. I'll appreciate any help and suggestion!

Marta

deboraycb commented 4 years ago

Hi,

I have the same error with angsd version 0.921-8-gc12d2fa (htslib: 1.7-23-g9f9631d). With the newest version, 0.930, I get an upstream error because my vcf cannot be read; this is the error reported in #264 and #227. @martabe, which version are you using?

My realSFS check tells me:

-> problems with unsorted saf file chromoname: '10' pos[2]:125233 vs posd[2-1]:249224030

and the first 2 columns of my realSFS print output are:

1       249209140
10      249220687
10      249224031
10      125234
10      126070

which corresponds to my vcf lines:

1       249209140
1       249220687
1       249224031
10      125234
10      126070

I'd also appreciate any suggestions regarding this issue.

martabe commented 4 years ago

Hi,

I'm running version 0.921 (htslib: 1.6). I didn't try with a newer version so far, this one was the one provided by the cluster admins. Unfortunately I will not have time to try in the next few weeks, but I'll keep the post updated if I try.

Cheers, Marta

nspope commented 4 years ago

There were big changes to the vcf reader in a commit in Feb 2019, so I would not advise using versions <0.929 for working with vcf.

Some further issues with the INDEL flag or with the PL field can be worked around for now by making appropriate changes to the header, as in my comment to #227.

deboraycb commented 4 years ago

Just in case someone else found a solution to this, I'm still getting this error with version 0.931

z0on commented 4 years ago

This might be very silly, but when I get an error of this kind, before thinking of anything else I would check whether my line endings are actually Unix type (not Mac or Windows). I have stepped on this specific rake more often than I care to admit.
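A quick way to run that check, as a minimal sketch: count the lines that contain a carriage return. The function name count_cr_lines is hypothetical. A result of 0 means the file has plain Unix line endings.

```shell
# Hypothetical helper: count lines containing a carriage return (CR),
# which indicates Windows (CRLF) or old Mac (CR-only) line endings.
# $1: text file to inspect, for example a VCF.
count_cr_lines() {
    awk '/\r/ { n++ } END { print n + 0 }' "$1"
}

# Example usage (file name taken from the commands above):
# count_cr_lines myvcfOnesample.vcf
# For CRLF files, the CRs can be stripped with: tr -d '\r' < in.vcf > out.vcf
```

A file with CR-only endings is read as a single line, so it reports 1 rather than the true number of records. Any non-zero count means the file should be converted first.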

Misha

martabe commented 4 years ago

Dear all,

Sorry for the very late response and thanks to everyone making efforts to solve the issue.

Now -doSaf only outputs one chromosome...

I tested a 1 sample, 7 chromosomes, 1000 sites per chromosome subset of my vcf, and the result is the same, only the first chromosome is included in the output.

These are the commands I used:

angsd -vcf-gl mytest.vcf -doSaf 1  -out mytestSaf -P 1 -nInd 1 -doMajorMinor 1 -anc Peex304.fasta -fold 1

realSFS check mytestSaf.saf.idx
realSFS print mytestSaf.saf.idx > mytestSaf.saf.idx.txt

I tested -doSaf with and without the -fai option, given the last comment to #268, but I see no difference in the output.

The realSFS check output does not mention any error, and in the print of the saf.idx I only have chromosome 1.

I can provide the files if needed (I'm not doing it now because I would have to subset the genome).

Any suggestion? Thanks, Marta

sandyplus commented 4 years ago

Hi all,

I have the same error with angsd version 0.930, and I cannot get the 2D SFS. The realSFS check output shows:

problems with unsorted saf file chromoname: 'Qrob_H2.3_Sc0000124' pos[53307]:3086 vs posd[53307-1]:1639536

Any suggestions?

Best regards, Sandy

martabe commented 4 years ago

Hi Sandy,

You could try the unsafe workaround I gave to #270, if you feel adventurous.

Best, Marta

ANGSD commented 4 years ago

This should be resolved in commit cea07b3, so I am closing this issue. But feel free to reopen or submit a new issue if needed.

Best

martabe commented 3 years ago

Dear all,

I've been using version 0.933 but I still have that problem. I'm using some fake files with only 4 sites, two per chromosome. The problem is the same.

Input file is a vcf (added INDEL line, passed through dos2unix, just in case) with only 4 SNPs:

PeexChr1 224249438
PeexChr1 224249748
PeexChr2 149739
PeexChr2 149995

angsd0.933/angsd -vcf-PL test_pop1_indelinfo.vcf -doSaf 1 -fai Peex304test.fasta.fai -out test_pop1 -P 2 -nInd 18 -doMajorMinor 5 -anc Peex304test.fasta
angsd0.933/misc/realSFS print test_pop1.saf.idx > test_pop1.saf.idx.table

cat of test_pop1.saf.idx.table shows 4 sites, all on chr2:

PeexChr2 224249438
PeexChr2 224249748
PeexChr2 149739
PeexChr2 149995

Did I miss something? Does anyone still experience this problem?

Thanks a lot, Marta

PS: can send test files if useful.

ANGSD commented 3 years ago

Sure, the test file would be useful, including the header.

Best

martabe commented 3 years ago

Test files:

vcf (remove .txt): test_pop1_indelinfo.vcf.txt https://github.com/ANGSD/angsd/files/6083723/test_pop1_indelinfo.vcf.txt
fai (remove .txt): Peex304test.fasta.fai.txt https://github.com/ANGSD/angsd/files/6083727/Peex304test.fasta.fai.txt

Genome is too big even if I remove the useless chromosomes. I can upload it somewhere else if you need. It's here (but for future readers, I will remove it in a week from now): https://drive.google.com/file/d/1Ic3E1fJOCroF61ZN66BCGoXXZVUo0Z0J/view?usp=sharing

Thanks! Marta

ANGSD commented 3 years ago

Hi Marta, I can't reproduce the error, but I had to modify the files so I could run it. The modified files are shown here:

mbp:angsd user$ cat *.vcf

##fileformat=VCFv4.2
##FILTER=
##ALT=
##FILTER=60.0">
##FILTER=
##FILTER=
##FILTER=
##FILTER=
##FILTER=
##FILTER=3.0">
##FORMAT=
##FORMAT=
##FORMAT=
##FORMAT=
##FORMAT=
##FORMAT=
##FORMAT=
##FORMAT=
##GATKCommandLine=<ID=CombineGVCFs,CommandLine="CombineGVCFs --output cohort1.g.vcf --variant 1.g.vcf --variant 2.g.vcf --variant 3.g.vcf --variant 4.g.vcf --variant 5.g.vcf --variant 6.g.vcf --variant 7.g.vcf --variant 8.g.vcf --variant 9.g.vcf --variant 10.g.vcf --variant 11.g.vcf --variant 12.g.vcf --variant 13.g.vcf --variant 14.g.vcf --variant 15.g.vcf --variant 16.g.vcf --variant 17.g.vcf --variant 18.g.vcf --variant 19.g.vcf --variant 20.g.vcf --reference /home/ubelix/ips/mbinaghi/hybrids/data/genomes/Peex304.fasta --convert-to-base-pair-resolution false --break-bands-at-multiples-of 0 --ignore-variants-starting-outside-interval false --interval-set-rule UNION --interval-padding 0 --interval-exclusion-padding 0 --interval-merging-rule ALL --read-validation-stringency SILENT --seconds-between-progress-updates 10.0 --disable-sequence-dictionary-validation false --create-output-bam-index true --create-output-bam-md5 false --create-output-variant-index true --create-output-variant-md5 false --lenient false --add-output-sam-program-record true --add-output-vcf-command-line true --cloud-prefetch-buffer 40 --cloud-index-prefetch-buffer -1 --disable-bam-index-caching false --sites-only-vcf-output false --help false --version false --showHidden false --verbosity INFO --QUIET false --use-jdk-deflater false --use-jdk-inflater false --gcs-max-retries 20 --gcs-project-for-requester-pays --disable-tool-default-read-filters false --disable-tool-default-annotations false --enable-all-annotations false",Version=4.1.0.0,Date="15 July 2019 20:20:33 CEST">
##GATKCommandLine=<ID=SelectVariants,CommandLine="SelectVariants --output hardfiltered_biallelic.vcf --restrict-alleles-to BIALLELIC --variant hardfiltered.vcf --reference /home/ubelix/ips/mbinaghi/hybrids/data/genomes/Peex304.fasta --add-output-vcf-command-line true --invertSelect false --exclude-non-variants false --exclude-filtered false --preserve-alleles false --remove-unused-alternates false --keep-original-ac false --keep-original-dp false --mendelian-violation false --invert-mendelian-violation false --mendelian-violation-qual-threshold 0.0 --select-random-fraction 0.0 --remove-fraction-genotypes 0.0 --fully-decode false --max-indel-size 2147483647 --min-indel-size 0 --max-filtered-genotypes 2147483647 --min-filtered-genotypes 0 --max-fraction-filtered-genotypes 1.0 --min-fraction-filtered-genotypes 0.0 --max-nocall-number 2147483647 --max-nocall-fraction 1.0 --set-filtered-gt-to-nocall false --allow-nonoverlapping-command-line-samples false --suppress-reference-path false --interval-set-rule UNION --interval-padding 0 --interval-exclusion-padding 0 --interval-merging-rule ALL --read-validation-stringency SILENT --seconds-between-progress-updates 10.0 --disable-sequence-dictionary-validation false --create-output-bam-index true --create-output-bam-md5 false --create-output-variant-index true --create-output-variant-md5 false --lenient false --add-output-sam-program-record true --cloud-prefetch-buffer 40 --cloud-index-prefetch-buffer -1 --disable-bam-index-caching false --sites-only-vcf-output false --help false --version false --showHidden false --verbosity INFO --QUIET false --use-jdk-deflater false --use-jdk-inflater false --gcs-max-retries 20 --gcs-project-for-requester-pays --disable-tool-default-read-filters false",Version=4.1.0.0,Date="August 6, 2019 9:34:37 PM CEST">
##INFO=
##INFO=
##INFO=
##INFO=
##INFO=
##INFO=
##INFO=
##INFO=
##INFO=
##INFO=
##INFO=
##INFO=
##INFO=
##INFO=
##INFO=
##INFO=
##INFO=
##INFO=
##contig=
##contig=
##reference=file:///home/ubelix/ips/mbinaghi/hybrids/data/genomes/Peex304.fasta
##source=CombineGVCFs
##source=SelectVariants
##bcftools_viewVersion=1.10+htslib-1.10
##bcftools_viewCommand=view -s 8,18,24,25,33,34,38,40,42,44,45,48,52,57,60,6,5,61 /home/ubelix/ips/mbinaghi/hybrids/data/raw/variants/hardfiltered_biallelic_cr09_mm005_vcfinfo_hardmask.recode.vcf; Date=Thu Mar 4 11:19:19 2021
##INFO=
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 8 18 24 25 33 34 38 40 42 44 45 48 52 57 60 6 5 61

PeexChr1 1 . G A 918.05 PASS AC=0;AF=0.051;AN=36;BaseQRankSum=0;DP=373;ExcessHet=0.0251;FS=13.752;InbreedingCoeff=0.2098;MLEAC=7;MLEAF=0.051;MQ=60;MQRankSum=0;QD=30.6;ReadPosRankSum=0.967;SOR=0.762 GT:AD:DP:GQ:PL 0/0:7,0:7:21:0,21,274 0/0:3,0:3:9:0,9,113 0/0:9,0:9:27:0,27,375 0/0:7,0:7:21:0,21,258 0/0:2,0:2:6:0,6,88 0/0:4,0:4:12:0,12,150 0/0:7,0:7:21:0,21,266 0/0:6,0:6:18:0,18,224 0/0:5,0:5:12:0,12,180 0/0:7,0:7:21:0,21,266 0/0:4,0:4:12:0,12,141 0/0:2,0:2:6:0,6,62 0/0:2,0:2:6:0,6,90 0/0:2,0:2:6:0,6,83 0/0:7,0:7:21:0,21,254 0/0:3,0:3:9:0,9,128 0/0:8,0:8:24:0,24,333 0/0:7,0:7:21:0,21,280
PeexChr1 2 . T C 1166.66 PASS AC=2;AF=0.098;AN=32;BaseQRankSum=0;DP=225;ExcessHet=0;FS=0;InbreedingCoeff=0.3632;MLEAC=15;MLEAF=0.114;MQ=60;MQRankSum=0;QD=27.86;ReadPosRankSum=0.674;SOR=2.774 GT:AD:DP:GQ:PL 0/0:3,0:3:6:0,6,90 0/0:3,0:3:6:0,6,90 0/0:3,0:3:9:0,9,128 0/0:3,0:3:9:0,9,108 ./.:0,0:0:.:0,0,0 0/0:1,0:1:3:0,3,12 ./.:0,0:0:.:0,0,0 0/0:3,0:3:9:0,9,113 0/0:1,0:1:3:0,3,42 0/0:5,0:5:15:0,15,209 0/0:6,0:6:18:0,18,210/0:1,0:1:3:0,3,27 0/0:1,0:1:3:0,3,16 0/0:1,0:1:3:0,3,27 0/0:1,0:1:3:0,3,15 0/0:1,0:1:3:0,3,27 0/0:5,0:5:15:0,15,183 1/1:0,6:6:18:217,18,0
PeexChr2 1 . C T 14065.9 PASS AC=14;AF=0.413;AN=34;BaseQRankSum=0;DP=764;ExcessHet=0.006;FS=1.442;InbreedingCoeff=0.3319;MLEAC=63;MLEAF=0.5;MQ=52.55;MQRankSum=-0.674;QD=28.67;ReadPosRankSum=-0.353;SOR=0.583 GT:AD:DP:GQ:PL 0/0:8,0:8:0:0,0,78 0/0:15,0:15:0:0,0,364 0/0:2,0:2:6:0,6,83 0/1:6,14:20:99:570,0,210 0/0:6,0:6:0:0,0,96 1/1:0,4:4:12:180,12,0 0/1:8,8:16:99:312,0,1994 0/1:5,10:15:99:447,0,177 1/1:0,6:6:18:270,18,0/1:2,10:12:99:414,0,124 1/1:0,11:11:33:495,33,0 ./.:13,0:13:.:0,0,0 0/0:10,0:10:27:0,27,405 0/1:4,2:6:72:72,0,162 0/1:11,19:30:99:765,0,405 0/0:3,0:3:9:0,9,118 0/0:15,0:15:45:0,45,651 1/1:0,23:23:69:1035,69,0
PeexChr2 2 . C A 3490.9 PASS AC=4;AF=0.102;AN=34;BaseQRankSum=-0.842;DP=1016;ExcessHet=0;FS=1.357;InbreedingCoeff=0.8048;MLEAC=14;MLEAF=0.109;MQ=57.25;MQRankSum=-1.386;QD=28.52;ReadPosRankSum=1.04;SOR=0.479 GT:AD:DP:GQ:PL ./.:15,0:15:.:0,0,0 0/0:14,0:14:0:0,0,489 0/0:12,0:12:33:0,33,495 0/0:18,0:18:54:0,54,793 0/0:9,0:9:27:0,27,371 0/0:6,0:6:18:0,18,217 0/0:25,0:25:66:0,66,990 0/0:19,0:19:45:0,45,675 0/0:13,0:13:27:0,27,405 0/0:19,0:19:54:0,54,810 0/0:15,0:15:42:0,42,630 0/0:20,0:20:51:0,51,765 0/0:8,0:8:24:0,24,333 0/0:9,0:9:24:0,24,360 0/0:26,0:26:72:0,72,1080 1/1:0,6:6:18:270,18,0 1/1:0,20:20:60:893,60,0 0/0:25,0:25:60:0,60,900

mbp:angsd user$ cat anc.fa
>PeexChr1
GT
>PeexChr2
CC

mbp:angsd user$ ./angsd -vcf-pl test_pop1_indelinfo.vcf -domaf 1 -dosaf 1 -anc anc.fa

mbp:angsd user$ ./misc/realSFS print angsdput.saf.idx |cut -f1-2
-> Version of fname:angsdput.saf.idx is:2
-> Assuming .saf.gz file: angsdput.saf.gz
-> Assuming .saf.pos.gz: angsdput.saf.pos.gz
-> args: tole:0.000000 nthreads:4 maxiter:100 nsites:0 start:(null) chr:(null) start:-1 stop:-1 fstout:(null) oldout:0 seed:-1 bootstrap:0 resample_chr:0 whichFst:0 fold:0 ref:(null) anc:(null)
-> Will jump to multisaf printer and will only print intersecting sites between populations
-> dim(angsdput.saf.idx):37
-> Dimension of parameter space: 37
-> Done reading data from chromosome will prepare next chromosome
-> Is in multi sfs, will now read data from chr:PeexChr1
-> hello Im the master merge part of realSFS. and I'll now do a tripple bypass to find intersect
-> 1) Will set iter according to chooseChr and start and stop, and possibly using -sites
-> Only read nSites: 0 will therefore prepare next chromosome (or exit)
-> Done reading data from chromosome will prepare next chromosome
-> Is in multi sfs, will now read data from chr:PeexChr2
-> hello Im the master merge part of realSFS. and I'll now do a tripple bypass to find intersect
-> 1) Will set iter according to chooseChr and start and stop, and possibly using -sites
-> Only read nSites: 0 will therefore prepare next chromosome (or exit)
-> Done reading data from chromosome will prepare next chromosome
-> Run completed
PeexChr1 1
PeexChr1 2
PeexChr2 1
PeexChr2 2

martabe commented 3 years ago

Hi,

Thanks for testing that. I think the reason why it might not have worked out of the box for you could be the timestamp on the genome index (I downloaded the test files myself and had to touch the index for it to work).
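For anyone hitting the same timestamp issue, here is a minimal sketch of a check. It assumes the index sits next to the FASTA as `<fasta>.fai`, and that an index older than its FASTA is what causes the problem; that is an inference from the touch fix above, not something confirmed from the ANGSD source. The function name fai_is_stale is hypothetical.

```shell
# Hypothetical helper: succeed if the FASTA file is newer than its .fai index.
# $1: path to the FASTA; the index is assumed to be "$1.fai".
# If the index is missing, find fails quietly and the function returns false.
fai_is_stale() {
    [ -n "$(find "$1" -newer "$1.fai" 2>/dev/null)" ]
}

# Example usage (file name taken from the commands above):
# fai_is_stale Peex304test.fasta && touch Peex304test.fasta.fai
```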

Other than that, for some reason that I still haven't figured out, when I run the original files in their original folder I get the error, but if I move the file out to a new folder and run it again everything goes fine. So I don't know what is going on but it seems like I can run the software now.

Thanks a lot for taking the time to check, I really appreciated that.

Have a nice day, Marta

martabe commented 3 years ago

Hi, me again.

I understood what was causing the mix up. Option -P 2 causes mixed chromosome positions, option -P 1 does not. I work on a cluster and I request the CPUs according to the option in the command, but it seems like multithreading results in this behaviour with -doSaf. That also explains why I couldn't reproduce the error on the test files. I was requesting only one CPU given the files were small.
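To confirm this on other data, the site lists from a -P 1 run and a -P 2 run can be compared. This is a minimal sketch: it assumes both runs used the same input and differ only in -P and -out, and that each output was dumped with realSFS print. The function name same_sites and the table names in the usage comment are hypothetical.

```shell
# Hypothetical helper: succeed only if two `realSFS print` tables list the
# same chromosome/position pairs (columns 1 and 2) in the same order.
# $1, $2: tables written by `realSFS print`.
same_sites() {
    a=$(awk '{ print $1, $2 }' "$1" | cksum)
    b=$(awk '{ print $1, $2 }' "$2" | cksum)
    [ "$a" = "$b" ]
}

# Example usage (output names are hypothetical):
# same_sites test_pop1_P1.saf.idx.table test_pop1_P2.saf.idx.table \
#     || echo "site lists differ between -P 1 and -P 2"
```

Only the first two columns are compared, so small numerical differences in the likelihood columns do not trigger a mismatch.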

Marta

z0on commented 3 years ago

That's a highly non-trivial problem! Thanks a lot, Marta, for catching this

TeresaPegan commented 2 years ago

Hi all, I am still getting this problem. In my case it doesn't have anything to do with VCF files, but the outcome is the same: -dosaf produces a SAF file with sorting problems, and I think it has to do with failing to detect the end of chromosomes. I am using a highly fragmented, scaffold-level genome assembly for the species I am getting the error with. (I have not gotten the error with other species with higher-quality genomes.)

I am not using multithreading, so the issue martabe found is not the explanation in my case.

Here is my code for the safs:

$ANGSDIR/angsd -bam $POPLIST -ref $ANC -anc $ANC -out $OUT \
-dosaf 1 -GL 1 -uniqueOnly 1 -remove_bads 1 -only_proper_pairs 1 -minMapQ 30 -minQ 30 -C 50 -rf $REGFILE

This gives me a message saying

 !! You are doing -dosaf incombination with -rf, please make sure that your -rf file is sorted !!

The file is sorted, and I verified this by sorting it as suggested here.

The job finishes without errors. Then, I do this:

$ANGSDIR/misc/realSFS check $OUT.saf.idx

And I get this message:

    -> problems with unsorted saf file chromoname: 'VZRF01000187.1' pos[1822035]:22 vs posd[1822035-1]:2229692
    -> problems with unsorted saf file chromoname: 'VZRF01000187.1' pos[3644070]:22 vs posd[3644070-1]:2229692
    -> problems with unsorted saf file chromoname: 'VZRF01000586.1' pos[1522223]:112 vs posd[1522223-1]:1920221
    -> problems with unsorted saf file chromoname: 'VZRF01000586.1' pos[3044446]:112 vs posd[3044446-1]:1920221
    -> problems with unsorted saf file chromoname: 'VZRF01000927.1' pos[1777515]:360 vs posd[1777515-1]:2036024
    -> problems with unsorted saf file chromoname: 'VZRF01000927.1' pos[3555030]:360 vs posd[3555030-1]:2036024
    -> problems with unsorted saf file chromoname: 'VZRF01001110.1' pos[3144565]:18 vs posd[3144565-1]:3829545
    -> problems with unsorted saf file chromoname: 'VZRF01001110.1' pos[6289130]:18 vs posd[6289130-1]:3829545
    -> problems with unsorted saf file chromoname: 'VZRF01001586.1' pos[2177963]:4 vs posd[2177963-1]:2702387
    -> problems with unsorted saf file chromoname: 'VZRF01001586.1' pos[4355926]:4 vs posd[4355926-1]:2702387
    -> problems with unsorted saf file chromoname: 'VZRF01001720.1' pos[1967965]:140 vs posd[1967965-1]:2792947
    -> problems with unsorted saf file chromoname: 'VZRF01001720.1' pos[3935930]:140 vs posd[3935930-1]:2792947
    -> problems with unsorted saf file chromoname: 'VZRF01001859.1' pos[1540365]:68 vs posd[1540365-1]:2234032
    -> problems with unsorted saf file chromoname: 'VZRF01001859.1' pos[3080730]:68 vs posd[3080730-1]:2234032
    -> problems with unsorted saf file chromoname: 'VZRF01001867.1' pos[1499664]:311 vs posd[1499664-1]:1819316
    -> problems with unsorted saf file chromoname: 'VZRF01001867.1' pos[2999328]:311 vs posd[2999328-1]:1819316
    -> problems with unsorted saf file chromoname: 'VZRF01002078.1' pos[1168751]:10 vs posd[1168751-1]:1627490
    -> problems with unsorted saf file chromoname: 'VZRF01002078.1' pos[2337502]:10 vs posd[2337502-1]:1627490
    -> problems with unsorted saf file chromoname: 'VZRF01002081.1' pos[1255939]:121 vs posd[1255939-1]:1731002
    -> problems with unsorted saf file chromoname: 'VZRF01002081.1' pos[2511878]:121 vs posd[2511878-1]:1731002
    -> problems with unsorted saf file chromoname: 'VZRF01002083.1' pos[1448834]:45 vs posd[1448834-1]:1772714
    -> problems with unsorted saf file chromoname: 'VZRF01002083.1' pos[2897668]:45 vs posd[2897668-1]:1772714
    -> problems with unsorted saf file chromoname: 'VZRF01002164.1' pos[3001152]:70 vs posd[3001152-1]:3986750
    -> problems with unsorted saf file chromoname: 'VZRF01002164.1' pos[6002304]:70 vs posd[6002304-1]:3986750
    -> problems with unsorted saf file chromoname: 'VZRF01002170.1' pos[1337913]:14 vs posd[1337913-1]:1635394
    -> problems with unsorted saf file chromoname: 'VZRF01002170.1' pos[2675826]:14 vs posd[2675826-1]:1635394
    -> problems with unsorted saf file chromoname: 'VZRF01002270.1' pos[1572278]:3 vs posd[1572278-1]:1770276
    -> problems with unsorted saf file chromoname: 'VZRF01002270.1' pos[3144556]:3 vs posd[3144556-1]:1770276
    -> problems with unsorted saf file chromoname: 'VZRF01003085.1' pos[1873597]:935 vs posd[1873597-1]:2342476
    -> problems with unsorted saf file chromoname: 'VZRF01003085.1' pos[3747194]:935 vs posd[3747194-1]:2342476
    -> problems with unsorted saf file chromoname: 'VZRF01003198.1' pos[1781647]:47 vs posd[1781647-1]:2204817
    -> problems with unsorted saf file chromoname: 'VZRF01003198.1' pos[3563294]:47 vs posd[3563294-1]:2204817
    -> problems with unsorted saf file chromoname: 'VZRF01003225.1' pos[1424298]:37 vs posd[1424298-1]:1837532
    -> problems with unsorted saf file chromoname: 'VZRF01003225.1' pos[2848596]:37 vs posd[2848596-1]:1837532
    -> problems with unsorted saf file chromoname: 'VZRF01003403.1' pos[1750153]:161 vs posd[1750153-1]:2304690
    -> problems with unsorted saf file chromoname: 'VZRF01003403.1' pos[3500306]:161 vs posd[3500306-1]:2304690
    -> problems with unsorted saf file chromoname: 'VZRF01003559.1' pos[1591756]:65 vs posd[1591756-1]:2090553
    -> problems with unsorted saf file chromoname: 'VZRF01003559.1' pos[3183512]:65 vs posd[3183512-1]:2090553
    -> problems with unsorted saf file chromoname: 'VZRF01004612.1' pos[1150017]:8 vs posd[1150017-1]:1684003
    -> problems with unsorted saf file chromoname: 'VZRF01004612.1' pos[2300034]:8 vs posd[2300034-1]:1684003
    -> problems with unsorted saf file chromoname: 'VZRF01005175.1' pos[1475906]:6433 vs posd[1475906-1]:1754798
    -> problems with unsorted saf file chromoname: 'VZRF01005175.1' pos[2951812]:6433 vs posd[2951812-1]:1754798
    -> problems with unsorted saf file chromoname: 'VZRF01005414.1' pos[1216044]:215 vs posd[1216044-1]:1615746
    -> problems with unsorted saf file chromoname: 'VZRF01005414.1' pos[2432088]:215 vs posd[2432088-1]:1615746
    -> problems with unsorted saf file chromoname: 'VZRF01005562.1' pos[2384210]:1294 vs posd[2384210-1]:2875881
    -> problems with unsorted saf file chromoname: 'VZRF01005562.1' pos[4768420]:1294 vs posd[4768420-1]:2875881
    -> problems with unsorted saf file chromoname: 'VZRF01005657.1' pos[1572672]:426 vs posd[1572672-1]:2055607
    -> problems with unsorted saf file chromoname: 'VZRF01005657.1' pos[3145344]:426 vs posd[3145344-1]:2055607
    -> problems with unsorted saf file chromoname: 'VZRF01005754.1' pos[1419380]:145 vs posd[1419380-1]:1771275
    -> problems with unsorted saf file chromoname: 'VZRF01005754.1' pos[2838760]:145 vs posd[2838760-1]:1771275
    -> problems with unsorted saf file chromoname: 'VZRF01006295.1' pos[1595018]:201 vs posd[1595018-1]:2066701
    -> problems with unsorted saf file chromoname: 'VZRF01006295.1' pos[3190036]:201 vs posd[3190036-1]:2066701
    -> problems with unsorted saf file chromoname: 'VZRF01006584.1' pos[1307392]:8 vs posd[1307392-1]:1628002
    -> problems with unsorted saf file chromoname: 'VZRF01006584.1' pos[2614784]:8 vs posd[2614784-1]:1628002
    -> problems with unsorted saf file chromoname: 'VZRF01006704.1' pos[1312512]:99 vs posd[1312512-1]:1802706
    -> problems with unsorted saf file chromoname: 'VZRF01006704.1' pos[2625024]:99 vs posd[2625024-1]:1802706
    -> problems with unsorted saf file chromoname: 'VZRF01006819.1' pos[1766192]:128 vs posd[1766192-1]:2791916
    -> problems with unsorted saf file chromoname: 'VZRF01006819.1' pos[3532384]:128 vs posd[3532384-1]:2791916
    -> problems with unsorted saf file chromoname: 'VZRF01006837.1' pos[1645839]:240 vs posd[1645839-1]:2190617
    -> problems with unsorted saf file chromoname: 'VZRF01006837.1' pos[3291678]:240 vs posd[3291678-1]:2190617
    -> problems with unsorted saf file chromoname: 'VZRF01007136.1' pos[1587237]:35 vs posd[1587237-1]:1956362
    -> problems with unsorted saf file chromoname: 'VZRF01007136.1' pos[3174474]:35 vs posd[3174474-1]:1956362
    -> problems with unsorted saf file chromoname: 'VZRF01007591.1' pos[1554923]:35 vs posd[1554923-1]:2101834
    -> problems with unsorted saf file chromoname: 'VZRF01007591.1' pos[3109846]:35 vs posd[3109846-1]:2101834

These problem sites are all near the ends of the chromosomes, possibly the last SNP on each chromosome.

Here is my ANGSD version:

-> angsd version: 0.935 (htslib: 1.11) build(Apr 19 2021 15:46:47)

Finally, in case it's helpful, here is my rf file. It includes the first and last bases in a number of scaffolds.

VZRF01000187.1 1 2229693
VZRF01000586.1 1 1920239
VZRF01000927.1 1 2036130
VZRF01001110.1 1 3829561
VZRF01001586.1 1 2702612
VZRF01001720.1 1 2793601
VZRF01001859.1 1 2234040
VZRF01001867.1 1 1819343
VZRF01002078.1 1 1627592
VZRF01002081.1 1 1731244
VZRF01002083.1 1 1772795
VZRF01002164.1 1 3986751
VZRF01002170.1 1 1635836
VZRF01002270.1 1 1770293
VZRF01003085.1 1 2342532
VZRF01003198.1 1 2205074
VZRF01003225.1 1 1837724
VZRF01003403.1 1 2304910
VZRF01003559.1 1 2090620
VZRF01004612.1 1 1684394
VZRF01005175.1 1 1754815
VZRF01005414.1 1 1615767
VZRF01005562.1 1 2875882
VZRF01005657.1 1 2055830
VZRF01005754.1 1 1771318
VZRF01006295.1 1 2066764
VZRF01006584.1 1 1628026
VZRF01006704.1 1 1802776
VZRF01006819.1 1 2792923
VZRF01006837.1 1 2190622
VZRF01007136.1 1 1957350
VZRF01007591.1 1 2101985

Let me know what I might be able to try to fix this, thanks!

-Teresa