faircloth-lab / phyluce

software for UCE (and general) phylogenomics
http://phyluce.readthedocs.org/

Missing information for all loci using internal trimming/gblocks? #129

Closed by ghost 6 years ago

ghost commented 6 years ago

Hi, I am trying to test my data set using both edge and internal trimming, following the phyluce tutorial. Edge trimming seems to have worked fine, but the phyluce_align_get_gblocks_trimmed_alignments_from_untrimmed script gives me "WARNING - Missing information for locus uce-#whatever" for all the loci, resulting in an empty output folder. I checked the previous step, and all the loci seem to have been written as FASTA, as in the tutorial.

Thanks,

Miles

brantfaircloth commented 6 years ago

Did you output the trimmed alignments in FASTA format? It's needed by gblocks (rather than nexus, which is the default).

ghost commented 6 years ago

Yes, they are all in the mafft-nexus-internal-trimmed folder, named uce-#.fasta. I also checked and made sure they are indeed in FASTA format.

brantfaircloth commented 6 years ago

Take one locus and try to run gblocks against it - that may indicate what the problem is.

ghost commented 6 years ago

It seems to be giving the same error:

phyluce_align_get_gblocks_trimmed_alignments_from_untrimmed \
    --alignments mafft-nexus-internal-trimmed \
    --output mafft-nexus-internal-trimmed-gblocks \
    --cores 16 \
    --output-format fasta \
    --log-path log

2018-10-01 12:43:55,309 - phyluce_align_get_gblocks_trimmed_alignments_from_untrimmed - INFO - Starting phyluce_align_get_gblocks_trimmed_alignments_from_untrimmed
2018-10-01 12:43:55,310 - phyluce_align_get_gblocks_trimmed_alignments_from_untrimmed - INFO - Version: git fatal: Not a git repository: '/apps/phyluce/20180727/lib/python2.7/site-packages/.git'
2018-10-01 12:43:55,311 - phyluce_align_get_gblocks_trimmed_alignments_from_untrimmed - INFO - Argument --alignments: /ufrc/lucky/yuanmeng.zhang/Nylanderia/taxon-sets/all/mafft-nexus-internal-trimmed
2018-10-01 12:43:55,311 - phyluce_align_get_gblocks_trimmed_alignments_from_untrimmed - INFO - Argument --b1: 0.5
2018-10-01 12:43:55,311 - phyluce_align_get_gblocks_trimmed_alignments_from_untrimmed - INFO - Argument --b2: 0.85
2018-10-01 12:43:55,311 - phyluce_align_get_gblocks_trimmed_alignments_from_untrimmed - INFO - Argument --b3: 8
2018-10-01 12:43:55,311 - phyluce_align_get_gblocks_trimmed_alignments_from_untrimmed - INFO - Argument --b4: 10
2018-10-01 12:43:55,311 - phyluce_align_get_gblocks_trimmed_alignments_from_untrimmed - INFO - Argument --cores: 16
2018-10-01 12:43:55,311 - phyluce_align_get_gblocks_trimmed_alignments_from_untrimmed - INFO - Argument --input_format: fasta
2018-10-01 12:43:55,311 - phyluce_align_get_gblocks_trimmed_alignments_from_untrimmed - INFO - Argument --log_path: /ufrc/lucky/yuanmeng.zhang/Nylanderia/taxon-sets/all/log
2018-10-01 12:43:55,311 - phyluce_align_get_gblocks_trimmed_alignments_from_untrimmed - INFO - Argument --output: /ufrc/lucky/yuanmeng.zhang/Nylanderia/taxon-sets/all/mafft-nexus-internal-trimmed-gblocks
2018-10-01 12:43:55,311 - phyluce_align_get_gblocks_trimmed_alignments_from_untrimmed - INFO - Argument --output_format: fasta
2018-10-01 12:43:55,311 - phyluce_align_get_gblocks_trimmed_alignments_from_untrimmed - INFO - Argument --verbosity: INFO
2018-10-01 12:43:55,311 - phyluce_align_get_gblocks_trimmed_alignments_from_untrimmed - INFO - Starting phyluce_align_get_gblocks_trimmed_alignments_from_untrimmed
2018-10-01 12:43:55,312 - phyluce_align_get_gblocks_trimmed_alignments_from_untrimmed - INFO - Getting aligned sequences for trimming
2018-10-01 12:43:55,343 - phyluce_align_get_gblocks_trimmed_alignments_from_untrimmed - INFO - Alignment trimming begins.

2018-10-01 12:44:02,426 - phyluce_align_get_gblocks_trimmed_alignments_from_untrimmed - INFO - Alignment trimming ends
2018-10-01 12:44:02,426 - phyluce_align_get_gblocks_trimmed_alignments_from_untrimmed - INFO - Writing output files
2018-10-01 12:44:02,426 - phyluce_align_get_gblocks_trimmed_alignments_from_untrimmed - WARNING - Missing information for locus uce-10526
INFO - Completed phyluce_align_get_gblocks_trimmed_alignments_from_untrimmed
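When every locus fails, it can help to pull the affected locus names out of the log instead of reading it by eye. The following is a minimal sketch, assuming only the warning text shown in the log above; the function name `missing_loci` is hypothetical and not part of phyluce.

```python
import re

# Matches the warning text as it appears in the log above and captures the locus name.
MISSING = re.compile(r"WARNING - Missing information for locus (\S+)")

def missing_loci(log_text):
    """Return the locus names flagged as missing, in order of appearance."""
    return MISSING.findall(log_text)

# Two entries copied from the log in this thread.
sample = (
    "2018-10-01 12:44:02,426 - phyluce_align_get_gblocks_trimmed_alignments_from_untrimmed"
    " - INFO - Writing output files\n"
    "2018-10-01 12:44:02,426 - phyluce_align_get_gblocks_trimmed_alignments_from_untrimmed"
    " - WARNING - Missing information for locus uce-10526\n"
)
print(missing_loci(sample))  # ['uce-10526']
```

If the number of names returned equals the number of input alignments, the problem is more likely in the environment (for example the gblocks binary) than in any individual alignment.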

brantfaircloth commented 6 years ago

Sorry - that was not clear. I meant using gblocks directly, so that would be something like running:

gblocks test/uce-1004.fasta -t DNA -b1=0.5 -b2=0.85 -b3=8 -b4=8 -b5=h -p=n

Alternatively, attach a single alignment file, and I'll have a look.

ghost commented 6 years ago

Here is what I got using your code:

gblocks mafft-nexus-internal-trimmed/uce-10.fasta -t DNA -b1=0.5 -b2=0.85 -b3=8 -b4=8 -b5=h -p=n

74 sequences and 1608 positions in the first alignment file: mafft-nexus-internal-trimmed/uce-10.fasta

WARNING: Parameter -b1 not properly entered. The minimum number of sequences for a conserved position must be bigger than or equal to 38 (half the number of sequences + 1)

WARNING: minimum number of sequences for a flank position set to the minimum possible value.

mafft-nexus-internal-trimmed/uce-10.fasta
Original alignment: 1608 positions
Gblocks alignment: 510 positions (31 %) in 21 selected block(s)
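The -b1 warning appears to come from passing a proportion (0.5) where Gblocks expects a count of sequences; it is unrelated to the empty output folder. The arithmetic in the warning can be checked with a short sketch. The function names are hypothetical, and the use of integer division for odd sequence counts is an assumption, since the thread only shows the even case (74 sequences).

```python
def count_fasta_sequences(lines):
    """Count FASTA records: each record starts with a '>' header line."""
    return sum(1 for line in lines if line.startswith(">"))

def min_conserved_sequences(n_sequences):
    """'Half the number of sequences + 1', as stated in the Gblocks warning.

    Integer division for odd counts is an assumption of this sketch.
    """
    return n_sequences // 2 + 1

# The alignment in this thread has 74 sequences.
print(min_conserved_sequences(74))  # 38
```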

And here is the alignment file. I had to change the extension from .fasta to .txt in order to upload it: uce-10.txt

brantfaircloth commented 6 years ago

I can't say what's happening. I just took the alignment you attached, put that in a folder, renamed it uce-10.fasta and ran:

python phyluce_align_get_gblocks_trimmed_alignments_from_untrimmed \
    --alignments aligns \
    --output test-out \
    --input-format fasta \
    --output-format nexus

This ran and produced output, as expected. That output is attached, as TXT: uce-10.txt

ghost commented 6 years ago

Hmm, that is odd. Okay, thank you. I will play around and report back.

moskalenko commented 6 years ago

I support application installs in the environment @ymilesz is likely running in. I've traced the execution of phyluce_align_get_gblocks_trimmed_alignments_from_untrimmed and noticed that

a) trimmomatic was not installed by the conda install, so illumiprocessor was failing the trimmomatic test.

b) phyluce_align_get_gblocks_trimmed_alignments_from_untrimmed tried to run Gblocks instead of gblocks. It looks like $CONDA/config/phyluce.conf lists Gblocks, and a smattering of other files might reference the name as well. Once either a Gblocks symlink is created or config/phyluce.conf is changed to use 'gblocks', a uce-10.nexus output file that matches what you posted is created with no errors.
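A mismatch like this (configured name and installed binary differing only in letter case) can be detected by comparing the configured file name against the actual directory listing. This is a minimal sketch, not phyluce code; both function names are hypothetical, and the path to check is whatever the gblocks entry in phyluce.conf expands to on a given install.

```python
import os

def classify_binary_name(configured_name, dir_entries):
    """Compare a configured file name against the names actually present."""
    if configured_name in dir_entries:
        return ("ok", configured_name)
    for entry in dir_entries:
        # Same letters, different case: the situation described in this thread.
        if entry.lower() == configured_name.lower():
            return ("case-mismatch", entry)
    return ("missing", None)

def check_configured_binary(configured_path):
    """Classify a full path by listing its parent directory."""
    directory, name = os.path.split(configured_path)
    return classify_binary_name(name, os.listdir(directory or "."))

# Config says Gblocks, but only gblocks is installed.
print(classify_binary_name("Gblocks", ["gblocks", "mafft", "trimmomatic"]))
# ('case-mismatch', 'gblocks')
```

Comparing against the directory listing rather than calling os.path.exists keeps the check meaningful on case-insensitive filesystems, where both spellings would appear to exist.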

conda installed phyluce=1.6.7

That appears to be the latest version.

Cheers,

Alex

brantfaircloth commented 6 years ago

Hi Alex,

thanks for the feedback.

a) i had assumed that trimmomatic was installed... in fact, it should be installed by following the dependency chain of phyluce (phyluce requires illumiprocessor: https://github.com/bioconda/bioconda-recipes/blob/master/recipes/phyluce/meta.yaml and illumiprocessor requires trimmomatic: https://github.com/bioconda/bioconda-recipes/blob/master/recipes/illumiprocessor/meta.yaml)

b) what was once gblocks was updated in bioconda to the name Gblocks, which I'm guessing is the source of the issue. I'm also guessing that this change did not make it over to miles' install for some reason or another.

I wonder if what's happened is that this is an older install of phyluce that did not successfully upgrade to follow the bioconda dependencies?

moskalenko commented 6 years ago

This was a new clean install of phyluce=1.7.6 within a new conda environment as far as I know.

brantfaircloth commented 6 years ago

that’s extra weird - I just did a clean install and it pulled down everything. do you mind sharing the content of ~/.condarc?

moskalenko commented 6 years ago

The person who did the install didn't have a .condarc. I'm going to try installing phyluce=1.7.6 with my .condarc, which contains the following channels:

moskalenko commented 6 years ago

Alright, I just created a fresh conda environment and installed phyluce=1.7.6 inside it with my .condarc. It looks like both trimmomatic and Gblocks got installed. Gblocks runs, as indicated by the presence of an up-to-date uce-10.fasta-gb. However, the final uce-10.nexus file is not produced. I see gblocks:$CONDA/bin/gblocks in the default config/phyluce.conf.

Once I change the phyluce.conf or add a gblocks symlink I get a nexus file.
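The symlink workaround can be scripted so it is safe to run more than once. This is a minimal sketch under stated assumptions: the function name `add_alias` is hypothetical, and which spelling is the existing binary and which is the alias depends on the install (the thread reports the mismatch in both directions).

```python
import os

def add_alias(bin_dir, existing_name, alias_name):
    """Create bin_dir/alias_name as a symlink to existing_name.

    Returns True if a link was created, False if the alias already exists.
    """
    target = os.path.join(bin_dir, existing_name)
    alias = os.path.join(bin_dir, alias_name)
    if not os.path.exists(target):
        raise FileNotFoundError(target)
    if os.path.lexists(alias):
        return False  # alias (or a file with that name) is already there
    # A relative link keeps working if the environment directory is moved.
    os.symlink(existing_name, alias)
    return True

# Hypothetical usage inside an activated conda environment:
# add_alias(os.path.join(os.environ["CONDA_PREFIX"], "bin"), "Gblocks", "gblocks")
```

Editing phyluce.conf, the other fix mentioned above, has the same effect; the symlink simply leaves the packaged config untouched.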

grep -i gblocks $(which phyluce_align_get_gblocks_trimmed_alignments_from_untrimmed)

description="""Use GBLOCKS to trim existing alignments in parallel""",
help="""The GBLOCKS -b1 proportion.""",
help="""The GBLOCKS -b2 proportion.""",
help="""The GBLOCKS -b3 integer value.""",
help="""The GBLOCKS -b4 integer value.""",
get_user_path("binaries","gblocks"),
def write_gblocks_alignments_to_outdir(log, outdir, alignments, format):
write_gblocks_alignments_to_outdir(log, args.output, alignments, args.output_format)

brantfaircloth commented 6 years ago

Something strange is still going on and I can't quite put my finger on what it is. I've got a clean install of the same 1.6.7 version (tested on both linux and macOS), and the path to Gblocks is specified in the file:

[other stuff in path]/conda/envs/test_phyluce/config/phyluce.conf

That path is correctly listed as gblocks:$CONDA/bin/Gblocks. The code you show (the get_user_path line) is running the get_user_path function to grab and expand this path to Gblocks using the key (gblocks) and value ($CONDA/bin/Gblocks) in phyluce.conf. That is working correctly in both of my clean installs.

Is there a $USER ~/.phyluce.conf by any chance? If there is, values in that file take precedence over the defaults that come with any particular version (e.g. those values in [other stuff in path]/conda/envs/test_phyluce/config/phyluce.conf). Perhaps that explains the issue... otherwise, I'm baffled.
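The precedence rule described here (user config overrides the packaged default) can be illustrated with the standard library. This is a sketch of the idea only, not phyluce's get_user_path: the [binaries] section name is inferred from the get_user_path("binaries","gblocks") call shown earlier, and the plain string substitution for $CONDA is an assumption about how the path is expanded.

```python
import configparser

def resolve_binary(key, conf_texts, conda_prefix):
    """Resolve a binary path from config texts given in increasing priority."""
    parser = configparser.ConfigParser(interpolation=None)
    for text in conf_texts:
        parser.read_string(text)  # later texts override earlier ones
    value = parser.get("binaries", key)
    # Assumption: $CONDA is expanded by simple substitution.
    return value.replace("$CONDA", conda_prefix)

default_conf = "[binaries]\ngblocks:$CONDA/bin/Gblocks\n"
user_conf = "[binaries]\ngblocks:$CONDA/bin/gblocks\n"

# Default only, then default plus a user override.
print(resolve_binary("gblocks", [default_conf], "/opt/conda"))
# /opt/conda/bin/Gblocks
print(resolve_binary("gblocks", [default_conf, user_conf], "/opt/conda"))
# /opt/conda/bin/gblocks
```

Here /opt/conda is a placeholder prefix. Printing the resolved path and checking that a file with exactly that name exists is a quick way to rule a stale override in or out.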

moskalenko commented 6 years ago

No ~/.phyluce.conf.

Default phyluce.conf has bin/gblocks instead of Gblocks as far as I see.

$ grep gblocks 1.7.6/config/phyluce.conf
gblocks:$CONDA/bin/gblocks

The install we have in use is fine now and I don't mind putting a check into a README for future installs, so this probably can be closed.

Thanks for the discussion!

Alex

mchj74 commented 5 years ago

Hi.

phyluce_align_get_gblocks_trimmed_alignments_from_untrimmed \
    --alignments mafft-nexus-internal-trimmed \
    --output mafft-nexus-internal-trimmed-gblocks \
    --cores 1 \
    --log log

runs and completes successfully, and creates a directory called mafft-nexus-internal-trimmed-gblocks, which is empty. Could you please give a hint as to why no file is created? Thanks.

brantfaircloth commented 5 years ago

Assuming you installed phyluce using conda (and did this recently), the error may be that you need to pass the correct flag for the type of alignment you are inputting. However, it's very hard to tell without additional information and/or some test data.
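One way to check which format flag applies is to look at how the alignment files actually begin: FASTA records start with a ">" header, and NEXUS files start with "#NEXUS". The following is a minimal sketch with hypothetical function names; it only distinguishes these two formats and reports anything else as unknown.

```python
def guess_alignment_format(text):
    """Guess the alignment format from the start of a file's contents."""
    head = text.lstrip()
    if head.startswith(">"):
        return "fasta"
    if head[:6].upper() == "#NEXUS":
        return "nexus"
    return "unknown"

def guess_file_format(path):
    """Read the first few hundred characters of a file and guess its format."""
    with open(path) as handle:
        return guess_alignment_format(handle.read(256))

print(guess_alignment_format(">taxon_1\nACGT-ACGT\n"))  # fasta
print(guess_alignment_format("#NEXUS\nbegin data;\n"))  # nexus
```

If the files turn out to be FASTA, the command can be rerun with --input-format fasta, the flag used earlier in this thread.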

wbsimey commented 5 years ago

I ran into the same issue. Once I edited ~/anaconda2/config/phyluce.conf from gblocks:$CONDA/bin/Gblocks to gblocks:$CONDA/bin/gblocks, it ran fine.

brantfaircloth commented 5 years ago

i think this issue may arise from upgrading phyluce, but not upgrading gblocks (this happens by default, unfortunately).