Closed malivezey closed 9 years ago
are your input files in the Myco folder named in this format: genus_species1.fasta, genus_species2.fasta etc.?
Tobias, No sir. They are fasta files, yes. But not with the filename incremented as advised above. The .tar.gz files are also in the same folder. I thought it would read the first file with a fasta extension, and move to the 2nd. I will give this a try. Just a little coaching like this will be very helpful. Thanks very much for responding. I have great hopes for these programs even if the initial response of my instructors has been muted.
Thanks very much! Martin Livezey
— Reply to this email directly or view it on GitHub https://github.com/faircloth-lab/phyluce/issues/24#issuecomment-85522110 .
I find the same result regardless of what my fasta files are named. Here is Myco directory:
ml@ml-laptop:~/Myco$ ls -1
Antsi1_AssemblyScaffolds_Repeatmasked.fasta
Aspfl1_AssemblyScaffolds_Repeatmasked.fasta
Beaba1_AssemblyScaffolds_Repeatmasked.fasta
Clagr3_AssemblyScaffolds_Repeatmasked.fasta
Cormi1_AssemblyScaffolds_Repeatmasked.fasta
Exigl1_AssemblyScaffolds_Repeatmasked.fasta
Fompi3_AssemblyScaffolds_Repeatmasked.fasta
genus_species1.fasta
genus_species2.fasta
genus_species3.fasta
gz
HypCI4A_1_AssemblyScaffolds_Repeatmasked.fasta
Morco1_AssemblyScaffolds_Repeatmasked.fasta
Phchr2_AssemblyScaffolds_Repeatmasked.fasta
Puccinia_graminis.masked.fasta
Ramac1_AssemblyScaffolds_Repeatmasked.fasta
Tapde1_1_AssemblyScaffolds_Repeatmasked.fasta
Maybe I am using the match contigs command in the wrong way.
ml@ml-laptop:~$ match_contigs_to_probes.py Myco uce-5k-probes.fasta output log
Traceback (most recent call last):
  File "/home/ml/anaconda/bin/match_contigs_to_probes.py", line 18, in <module>
produces the same error as:
ml@ml-laptop:~$ match_contigs_to_probes.py
Traceback (most recent call last):
  File "/home/ml/anaconda/bin/match_contigs_to_probes.py", line 18, in <module>
Is my setup wrong or is repository out-of-date?
There is something wrong with your command. I'm sorry, I didn't pay too much attention to that before. Try this command (in your specific case):
match_contigs_to_probes.py --contigs Myco/ --probes uce-2.5k-probes.fasta --output output/ --log-path log/
I would try to clean the input folder up and only keep the target files in there (the fasta containing the contigs, nothing else). I wrote the whole workflow in one document to get a better overview and I hope I'm explaining the steps well enough to follow:
https://github.com/tobiashofmann88/NGS/wiki/UCE-tutorial
but also check documentation on the phyluce github, it covers many more options.
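The advice above about keeping only the target FASTA files in the input folder can be checked before running anything. This is a minimal sketch, not part of phyluce: the function name `split_contig_dir` and the accepted extensions are my own assumptions, and the demo runs on a throwaway directory that only mimics the Myco listing (including its `gz` entry).

```python
import os
import tempfile
from pathlib import Path


def split_contig_dir(contig_dir):
    """Return (fasta_files, other_entries) for a directory, sorted by name.

    Hypothetical helper: anything that is not a regular file ending in
    .fasta or .fa is reported as "other" so it can be moved elsewhere.
    """
    fasta, other = [], []
    for entry in sorted(Path(contig_dir).iterdir()):
        if entry.is_file() and entry.suffix in {".fasta", ".fa"}:
            fasta.append(entry.name)
        else:
            other.append(entry.name)
    return fasta, other


# Demo on a temporary directory that imitates the folder shown earlier.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["genus_species1.fasta", "genus_species2.fasta"]:
        open(os.path.join(tmp, name), "w").close()
    os.mkdir(os.path.join(tmp, "gz"))
    fasta_files, stray = split_contig_dir(tmp)
    print("FASTA files:", fasta_files)
    print("Entries to move out of the input folder:", stray)
```

Pointing `split_contig_dir` at the real contigs folder would list whatever should be moved out before the `--contigs` argument is used.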
Thank you again Tobias. By the time you wrote, we had figured out the mistake in the command. The corrected command gave the same result. We discovered that the program was not able to call phyluce! So I either have a path problem or somehow I have not installed the repository correctly, even though I have run update and upgrade and everything seemed to be fine. So I and my new partner, who has more experience with Python and Linux, will continue to experiment. We know you have put a lot into it.
Since I have your attention, I would like to share my idea to see if you think it is scientifically valid. I want to use UCEs to test or provide support for, and to make, phylogenetic trees of fungi. The standard PCR approach that relies on ITS, LSU, SSU, TEF1, and others works pretty well, especially in practical terms. But larger taxonomic issues remain, and they seem to pervade mycology in general. My hope is that using, say, 10,000 base pairs of regions adjacent to UCEs will give a better picture. I am planning to use assembled sequences from the JGI 1000 Fungal Genomes website / MycoCosm.
http://genome.jgi-psf.org/programs/fungi/index.jsf
Maybe we can develop a new probe set based on the 5k one you have that responsibly spans yeast to a fancy mushroom (Saccharomycotina to Agaricomycotina).
What do you think?
I am doing this as a part of a class, here is the description of the course from the catalog:
BIOF 521, Spring, 3 credits
Bioinformatics for Analysis of Data Generated by Next Generation Sequencing
Ben Busby, Sijung Yun*
In this course, students will learn to analyze next generation sequencing data, particularly for DNA-seq, RNA-seq, CHIP-seq, and DNA methylation. The course will be divided between lectures and hands-on sessions. Lectures will cover background knowledge and survey various software programs. For hands-on sessions, we will primarily focus on the use of the Galaxy platform for analysis of raw data obtained from an Illumina’s HiSeq-2000 and data available in the NCBI-SRA. Use of distributed and abstracted computing, such as Biowulf and cloud computing will also be covered. There will be a term project in which students will design projects relevant to their research.
Learning objectives:
- Learn to analyze Next Generation Sequencing data including DNA-seq, RNA-seq, and CHIP-seq in Graphical User Interface using Galaxy or in command line
- Write short scripts to do this analysis using command line resources.
Prerequisites: students will be expected to bring their own laptop to each session.
Here is a link to the school:
Here is why I am taking it: for the love of mushrooms!
http://mushroomobserver.org/observer/observations_by_user/2584
Thanks again very much for your time. Martin
Hi Tobias, Please don't think I am a lost cause. I decided to test the program lines of match_contigs_to_probes.py, first line by line, then in a bash script shown below. The result seems to indicate that my path is not set up right, or that the repository, the program, or some dependency (for instance argparse) is out of date or improperly updated. The response to the bash script follows the " --log-path log" line below.
sudo apt-get install re
find re
find os
find sys
find glob
find copy
sudo apt-get install sqlite3
find sqlite3
import argparse
sudo apt-get install argparse
find argparse
find phyluce
from phyluce import lastz
find lastz
find phyluce.helpers
from phyluce.helpers import is_dir, is_file, FullPaths
find is_dir
find is_file
find FullPaths
find phyluce.log
from phyluce.log import setup_logging
find setup_logging
find collections
from collections import defaultdict
find defaultdict
find Bio
from Bio import SeqIO
find SeqIO
find pdb
import pdb
match_contigs_to_probes.py \
    --contigs /Myco \
    --probes uce-5k-probes.fasta \
    --output /output \
    --log-path log
ml@ml-laptop:~$ bash rmcont.foo
[sudo] password for ml:
Reading package lists... Done
Building dependency tree
Reading state information... Done
re is already the newest version.
0 upgraded, 0 newly installed, 0 to remove and 5 not upgraded.
re
os
sys
glob
copy
Reading package lists... Done
Building dependency tree
Reading state information... Done
sqlite3 is already the newest version.
0 upgraded, 0 newly installed, 0 to remove and 5 not upgraded.
sqlite3
Reading package lists... Done
Building dependency tree
Reading state information... Done
E: Unable to locate package argparse
argparse
find: `phyluce': No such file or directory
from: can't read /var/mail/phyluce
find: `lastz': No such file or directory
find: `phyluce.helpers': No such file or directory
from: can't read /var/mail/phyluce.helpers
find: `is_dir': No such file or directory
find: `is_file': No such file or directory
find: `FullPaths': No such file or directory
find: `phyluce.log': No such file or directory
from: can't read /var/mail/phyluce.log
find: `setup_logging': No such file or directory
find: `collections': No such file or directory
from: can't read /var/mail/collections
find: `defaultdict': No such file or directory
find: `Bio': No such file or directory
from: can't read /var/mail/Bio
find: `SeqIO': No such file or directory
find: `pdb': No such file or directory
import.im6: unable to grab mouse `': Resource temporarily unavailable @ error/xwindow.c/XSelectWindow/9047.
Traceback (most recent call last):
  File "/home/ml/anaconda/bin/match_contigs_to_probes.py", line 18, in <module>
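A note on the script above: inside a bash script, `import` and `from` are run as shell commands rather than as Python statements, which is why the output mentions /var/mail and import.im6 (an ImageMagick tool), and `find NAME` only looks for a file called NAME in the current directory. The modules are better tested from Python itself. Below is a minimal sketch, assuming it is run with the same interpreter that runs match_contigs_to_probes.py (here presumably /home/ml/anaconda/bin/python); the helper name `check_module` is my own and the module list is copied from the script above.

```python
import importlib


def check_module(name):
    """Try to import a module and report 'ok', 'missing', or the error raised."""
    try:
        importlib.import_module(name)
    except ImportError:
        # The module (or something it imports itself) could not be found.
        return "missing"
    except Exception as exc:
        # The module was found but failed while being imported.
        return "error: %s: %s" % (type(exc).__name__, exc)
    return "ok"


# Module names taken from the bash script above.
MODULES = [
    "re", "os", "sys", "glob", "copy", "sqlite3", "argparse",
    "phyluce", "phyluce.helpers", "phyluce.log",
    "collections", "Bio", "pdb",
]

for module_name in MODULES:
    print("%-16s %s" % (module_name, check_module(module_name)))
```

A result of "error: OSError ..." for phyluce, rather than "missing", would suggest that the package is installed and that something it calls while loading is what cannot be found.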
Tobias, I believe I am following your instructions, this is what I get:
here is my working folder:
ml@ml-laptop:~/contigs$ ls -l
total 63244
-rw-r--r-- 1 ml ml 11740461 Feb 15 19:58 genus_species1.fasta
-rw-r--r-- 1 ml ml 41529718 Feb 16 15:12 genus_species2.fasta
-rw-r--r-- 1 ml ml 10962406 Feb 16 15:12 genus_species3.fasta
drwxrwxr-x 2 ml ml     4096 Mar 30 20:26 log
drwxrwxr-x 2 ml ml     4096 Mar 30 20:33 mapped_uce
-rw-rw-r-- 1 ml ml   511332 Mar 18 18:58 uce-2.5k-probes.fasta
here is the response of the terminal:
ml@ml-laptop:~/contigs$ match_contigs_to_probes.py --contigs contigs/ --probes uce-2.5k-probes.fasta --output mapped_uce/ --log-path log
Traceback (most recent call last):
  File "/home/ml/anaconda/bin/match_contigs_to_probes.py", line 18, in <module>
Hey Martin, I'm sorry for not responding to your previous posts, I'm quite busy at the moment. By the way, I was not part of the development of this software, I'm also a user, just like you. But after spending some time with it I managed to get it working for my data, but I know the pain of running from one issue into the next and not knowing what's wrong. But if you follow the instructions I wrote step by step it should work (I made sure to write them very detailed), assuming that everything is installed and set up correctly. I'm not exactly sure what the issue is that you are running into. One thing I could think of:
Did you change your path from something like /usr/local/bin to /usr/local/anaconda/bin? You do that by editing the .bashrc file in your home directory (it's usually a hidden file). You have to change this line (or equivalent): export PATH="/usr/local/bin:/usr/local/jar:$PATH:/usr/local/bin"
into something along these lines export PATH="/usr/local/anaconda/bin:/usr/local/anaconda/jar:$PATH:/usr/local/bin"
Best, Tobi
also add this line to your .bashrc file: export CONDA_DEFAULT_ENV=/usr/local/anaconda
The link to my instruction manual has changed and is now: https://github.com/tobiashofmann88/UCE-data-management/wiki
Hey Tobi, Thanks so much for responding. I understand (and applaud) that you have other pressing issues in your life. Are you suggesting that the path be exactly the same? I am not at my Linux computer at the moment, but I did modify my .bashrc to follow the form you suggest (but for another path). Also, are you suggesting that the file names match exactly or simply have exactly the same form, i.e. text1.fasta?
I have another SSD, so maybe I will clean it up and do the entire install process from the beginning. It should not take me long at this point. A friend did install KDE, so that may have gummed things up. Thanks, Martin
The path does not have to be exactly the same; it depends on where the anaconda package is installed. Check where anaconda is installed and put the bin folder inside the anaconda folder at the front of your PATH:
export PATH="/full/path/to/anaconda/bin:/full/path/to/anaconda/jar:$PATH:/full/path/to/user/local/bin"
The other line I suggested adding also needs the correct path to anaconda: export CONDA_DEFAULT_ENV=/full/path/to/anaconda
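To see whether the PATH change actually took effect in a new terminal, something like the following can help. It is only a sketch for Python 3.3 or later (it uses `shutil.which`); the function name `path_report` is hypothetical, and /home/ml/anaconda/bin is taken from the traceback earlier in this thread, so substitute your own install location.

```python
import os
import shutil


def path_report(program, expected_bin_dir, path=None):
    """Report whether expected_bin_dir is on PATH and where program resolves.

    `path` defaults to the PATH of the current process; pass a string to
    test a different search path.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    entries = [os.path.abspath(p) for p in path.split(os.pathsep) if p]
    return {
        "bin_dir_on_path": os.path.abspath(expected_bin_dir) in entries,
        # None means the shell would not find the program on this PATH.
        "resolved": shutil.which(program, path=path),
    }


report = path_report("match_contigs_to_probes.py", "/home/ml/anaconda/bin")
print("anaconda bin on PATH:", report["bin_dir_on_path"])
print("script resolves to:  ", report["resolved"])
```

If the bin folder is on PATH but the script resolves somewhere else (or not at all), the .bashrc edit has probably not been picked up by the current shell.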
The files only have to have the same form as in my example. What is vital is the 'underscore' in the name, so e.g. text_a1.fasta would work but not text1.fasta.
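As a quick way to check file names against this rule, here is a minimal sketch. It encodes only what I described above (a .fasta extension and an underscore with text on both sides); it is my own assumption about the form and not the pattern phyluce itself applies, and the function name `has_expected_form` is made up for illustration.

```python
import os


def has_expected_form(filename):
    """True if the name ends in .fasta and has text on both sides of an underscore."""
    stem, ext = os.path.splitext(filename)
    head, _, tail = stem.partition("_")
    return ext == ".fasta" and bool(head) and bool(tail)


for name in ["text_a1.fasta", "text1.fasta", "genus_species1.fasta"]:
    verdict = "looks fine" if has_expected_form(name) else "needs renaming"
    print(name, "->", verdict)
```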
Best, Tobi
Got it. Thanks
This issue is not closed. It has not really even been responsibly answered. Thank you Tobias for explaining the path issue. Your comments are an exact duplicate of what is in the online instructions but it does not seem to help the program at all. The program still does not work, the documentation is scant, and the support is zero.
Hi Martin,
While it is possible to re-open the issue, I assumed that your "got it" comment meant you had things working. @tobiashofmann88 has done a nice job of trying to help you get things working, and he's gone above and beyond what most people would do. Thank you @tobiashofmann88.
Please avoid comments having to do with the "responsibility" of answering a question - none of us are paid to produce software or write documentation or provide support to users. Most of us are running full research ships on top of making code available for others to modify and use for their own purposes.
As Tobias mentioned, it appears you are having $PATH problems with your installation. If you take a look at the documentation (http://phyluce.readthedocs.org/en/latest/installation.html), it outlines reasonably clearly how to get the code installed using the conda package manager. The trickiest part is adding the anaconda distribution to your $PATH - additional details are here:
http://phyluce.readthedocs.org/en/latest/installation.html#checking-your-path
Once you have that working, it would probably be best to get a handle on how the software works by following the tutorial, which I have recently moved into this "stable" part of the repository. That is available here:
http://phyluce.readthedocs.org/en/latest/tutorial-one.html
Best of luck with your work.
I greatly appreciate both your and Tobias's time and I am willing to make a contribution to your research grant if I can get this running in a couple of days. I very strongly believe in the long-term value of your approach. I cannot understand why the idea of using UCEs and associated regions has not gained more traction. To me this is something any new student to the field is praying for every day: low-hanging fruit. You have done all the work. It simply needs to be applied now to an area that needs it very much. I believe it can make a big difference to issues of phylogeny in fungi, and please note, when I am successful, I have every intention of giving you both all of the credit you deserve. I am new, particularly to Linux, so if I am coming across as dense, just pitch me another bone. I have been hacking for at least two months on this and I don't think I have raised much of a fuss.
Hello Martin,
I wonder if you have figured out the issue? I got exactly the same error message as yours. Thanks!
Kai
hi Kai,
try installing git for your distribution to see if that fixes the error for the 1.5 version.
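For context on why a missing program would produce this error: `subprocess.Popen` raises OSError with errno 2 ("No such file or directory") when the executable it is asked to run cannot be found. The sketch below shows that behaviour and checks for git. That the command phyluce runs on import is a git command is an assumption inferred from the suggestion above, not something confirmed in this thread, and `try_run` is a hypothetical helper.

```python
import errno
import subprocess


def try_run(cmd):
    """Run cmd and return its combined output, or None if the executable is missing."""
    try:
        proc = subprocess.Popen(
            cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
        )
    except OSError as exc:
        if exc.errno == errno.ENOENT:
            # Same condition as "OSError: [Errno 2] No such file or directory".
            return None
        raise
    out, _ = proc.communicate()
    return out.decode("utf-8", "replace")


version = try_run(["git", "--version"])
if version is None:
    print("git was not found on PATH")
else:
    print("git is available:", version.strip())
```

If this reports that git was not found, installing it through the distribution's package manager and re-running the phyluce script would be the next thing to try.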
cheers, b
here is my directory structure:
ml@ml-laptop:~$ ls
anaconda   Desktop    Downloads         log       Music  output    Public     uce-2.5k-probes.fasta
anaconda3  Documents  examples.desktop  log-path  Myco   Pictures  Templates  Videos
here is my command: (first path is not needed!)
ml@ml-laptop:~$ /home/ml/anaconda/bin/match_contigs_to_probes.py Myco uce-5k-probes.fasta output log
here is the error message:
Traceback (most recent call last):
  File "/home/ml/anaconda/bin/match_contigs_to_probes.py", line 18, in <module>
    from phyluce import lastz
  File "/home/ml/anaconda/lib/python2.7/site-packages/phyluce/__init__.py", line 16, in <module>
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
  File "/home/ml/anaconda/lib/python2.7/subprocess.py", line 710, in __init__
    errread, errwrite)
  File "/home/ml/anaconda/lib/python2.7/subprocess.py", line 1335, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory
What is going wrong?
The folder 'Myco' contains assembled masked sequences from 1000 fungal genomes