Open Yoko-Hira opened 7 years ago
Hi, I got the same error as you did. Did you manage to solve the problem in the end?
I also got this problem. Any ideas?
Hi, I got the same problem. Is there anybody who solved it?
Hello, I also got the same problem. Has anyone solved it?
Hello,
I was able to make it work by changing the parameter for -l and -M. I hope this helps!
Hello!
I was able to make it work by changing the parameters for -l and -M. For example, I had -l 0.2 -m 0.1 -M 0.1, which did not work, so I changed it to -l 0.7 -m 0.0 -M 0.02, and this worked perfectly for me.
I hope this helps!
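For anyone who wants to try the same change, here is a minimal sketch of the rerun. Only the -l, -m and -M values come from the comment above; the HapMap file name is a hypothetical placeholder, and -b, -a 9 and -P CVC97 are simply copied from the command in the original post, so they may not suit other datasets.

```shell
# Hypothetical input name; substitute your own HapMap file.
hapmap="my_snps.hapmap.txt"

# Failing settings from the original post:  -l 0.2 -m 0.1 -M 0.1
# Settings reported to work in this thread: -l 0.7 -m 0.0 -M 0.02
# -b, -a 9 and -P CVC97 are copied from the original command and may not suit your data.
cmd="bash snphylo.sh -H $hapmap -b -a 9 -l 0.7 -m 0.0 -M 0.02 -P CVC97"

# Print the command for review; run it yourself once it looks right.
echo "$cmd"
```

These values worked for one dataset in this thread; there is no guarantee they carry over to another.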
Does anyone have a good solution to this problem?
Thanks.
Hi, I also got the same problem. I calculated the length of the longest line in the FASTA file, modified lines 267 and 269 accordingly, and that solved the issue.
But then a new error from MUSCLE occurred:
line 281: 29688 killed "${MUSCLE}" -phyi -in "${prefix_output}.fasta" -out "${prefix_output}.phylip.txt"
Has anybody else run into this problem? Thanks!
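The "longest line in fasta" check mentioned above could be done roughly like this. It is only a sketch: the function name is made up for illustration, and it assumes each sequence sits on a single line, so the longest non-header line equals the longest sequence. If the FASTA file wraps sequences across several lines, the number will be too small.

```shell
# Print the length of the longest non-header line in a FASTA file.
# Header lines (those starting with ">") are skipped.
# Assumption: one sequence per line (no line wrapping).
longest_fasta_line() {
    awk '!/^>/ { n = length($0); if (n > max) max = n } END { print max + 0 }' "$1"
}

# Example call. "snphylo.output" is the default output prefix shown in the
# usage text; use your own -P value instead if you set one.
# longest_fasta_line snphylo.output.fasta
```

The printed value can then be compared against the 50000 bp limit named in the error message.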
Hello
There could be various possible causes. However, the content in the following link might be helpful:
https://github.com/thlee/SNPhylo/issues/52#issuecomment-1973127123
bash snphylo.sh -H FINAL-ADMIXTURE-CVC880-MandSatsumaClem-CVC97-TASSEL-Samplename.hapmap.txt -b -a 9 -l 0.2 -m 0.1 -M 0.1 -P CVC97
Start to remove low quality data.
91 low quality lines were removed
SNPRelate -- supported by Streaming SIMD Extensions 2 (SSE2)
Start HapMap2GDS ...
Scanning ...
file: CVC97.filtered.hapmap
content: 51206 rows x 108 columns
Mon Oct 9 15:01:17 2017 store sample id, snp id, position, and chromosome.
start writing: 97 samples, 51205 SNPs ...
file: CVC97.filtered.hapmap
Mon Oct 9 15:02:13 2017 Done.
Hint: it is suggested to call `snpgdsOpen' to open a SNP GDS file instead of `openfn.gds'.
SNP pruning based on LD:
Excluding 0 SNP on non-autosomes
Excluding 51,205 SNPs (monomorphic: TRUE, MAF: 0.1, missing rate: 0.1)
Working space: 97 samples, 0 SNP
using 1 (CPU) core
sliding window: 500,000 basepairs, Inf SNPs
|LD| threshold: 0.2
method: composite
0 markers are selected in total.
Determine phylogenetic tree based on SNP data with a VCF, a HapMap, a Simple SNP or a GDS file
Version: 20140701
Usage:
snphylo.sh -v VCF_file [-p Maximum_PLCS (5)] [-c Minimum_depth_of_coverage (5)]|-H HapMap_file [-p Maximum_PNSS (5)]|-s Simple_SNP_file [-p Maximum_PNSS (5)]|-d GDS_file [-l LD_threshold (0.1)] [-m MAF_threshold (0.1)] [-M Missing_rate (0.1)] [-o Outgroup_sample_name] [-P Prefix_of_output_files (snphylo.output)] [-b [-B The_number_of_bootstrap_samples (100)]] [-a The_number_of_the_last_autosome (22)] [-r] [-A] [-h]
Options:
-A: Perform multiple alignment by MUSCLE
-b: Perform (non-parametric) bootstrap analysis and generate a tree
-h: Show help and exit
-r: Skip the step removing low quality data (-p and -c option are ignored).
Acronyms:
PLCS: The percent of Low Coverage Sample
PNSS: The percent of Sample which has no SNP information
LD: Linkage Disequilibrium
MAF: Minor Allele Frequency
Simple SNP File Format:
Chrom Pos SampleID1 SampleID2 SampleID3 ...
Error: The length of sequence is too long (> 50000 bp) to construct a tree! Please restart this script with different parameter values (-l, -m and/or -M).
I have tried all sorts of combinations of parameter values for -l, -m and -M; however, I keep getting the same error. Is there anything else I should be doing?
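One thing that may be worth checking after each parameter change is the pruning summary in the log. In the run above it reports "Excluding 51,205 SNPs" and "0 markers are selected in total." The sketch below pulls that count out of a saved log. The function name and the log file name are hypothetical, and it assumes the script's output was captured to a file (for example by appending `2>&1 | tee snphylo.log` to the command).

```shell
# Print the number from the last "N markers are selected in total." line
# of a saved SNPhylo log. Thousands separators (commas) are stripped.
# Prints nothing if the line is not found.
selected_markers() {
    awk '/markers are selected in total/ {
             for (i = 2; i <= NF; i++)
                 if ($i == "markers") { n = $(i - 1); gsub(",", "", n) }
         }
         END { if (n != "") print n }' "$1"
}

# Example call (hypothetical log file name):
# selected_markers snphylo.log
```

Watching how this number moves as -l, -m and -M change gives a quicker signal than waiting for the tree-building step to fail.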