Open teagerv opened 2 years ago
I'm having this same issue on Python 3.9.13. Have there been any updates?
Solution: I figured it out. If you're subsetting taxa, you have to make a file with the NCBI ids that you want to include, or it won't populate with any sequences (this is described in the 'Runs' doc). I don't know why I decided that wasn't relevant the last time I looked at this...
There is a helper script if you already have a file with all the names, but I just used a quick BioPython script to pull them and it's running now:
from Bio import Entrez


def main():
    # NCBI asks for a contact address with E-utilities requests; fill in your own.
    Entrez.email = ""
    db_type = 'nucleotide'
    search_terms = '(Architaenioglossa[Orgn])'
    output_file = '/home/snail/Desktop/architaenioglossa_taxalist.txt'
    # Assumed placeholder: set this to at least the number of results you expect.
    retmax = 10000
    returned_ids = esearch(search_terms, db_type, retmax)
    make_taxalist(returned_ids, output_file)


def esearch(search_terms, db_type, retmax):
    # Search the chosen database and return the matching ids as accessions.
    handle = Entrez.esearch(db=db_type, term=search_terms, idtype="acc", retmax=retmax)
    record = Entrez.read(handle)
    handle.close()
    print('Search returned %s results.\n' % record["Count"])
    return record["IdList"]


def make_taxalist(ids, output):
    # Append one id per line; re-running adds to the existing file.
    with open(output, 'a') as fh:
        for i in ids:
            fh.write(f'{i}\n')


if __name__ == '__main__':
    main()
Just set your search terms to the subset you want, set retmax to at least the number of taxa, and put in a random email (not sure if this is required).
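One thing to watch with the script above: it opens the output file in append mode, so running it twice leaves every id in the list twice. Here is a minimal sketch of a cleanup step, assuming the taxalist is plain text with one id per line. The function name `dedupe_taxalist` is made up for illustration and is not part of PyPHLAWD.

```python
from pathlib import Path


def dedupe_taxalist(path):
    """Rewrite the file at `path` without blank lines or repeated ids.

    Assumes one id per line. Keeps the first occurrence of each id,
    preserves the original order, and returns the number of ids kept.
    """
    path = Path(path)
    seen = set()
    unique_ids = []
    for line in path.read_text().splitlines():
        entry = line.strip()
        if entry and entry not in seen:
            seen.add(entry)
            unique_ids.append(entry)
    # Write the cleaned list back, one id per line.
    path.write_text("".join(f"{entry}\n" for entry in unique_ids))
    return len(unique_ids)
```

Calling `dedupe_taxalist(output_file)` after the script finishes should leave one line per id, whether or not the file already existed.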
Hi, I have the same problem! I have provided the taxalist, but it still doesn't work. Can anyone help? Thanks! The command and its output are shown here:
yang@bdchxy-PowerEdge-M630-VRTX:~$ python application/PyPHLAWD-master/src/setup_clade_ap.py -t Fagales -b /storage/phlawd_db_maker-master/DB/pln.db -s /storage/phlawd_db_maker-master/DB -o application/PyPHLAWD-master/examples/clustered/ -l application/PyPHLAWD-master/examples/clustered/ -f ncbi_sp_ids_938.txt
STARTING PYPHLAWD (⌯꒪͒ ꌂ̇ ꒪͒)
LIMITING TO TAXA IN ncbi_sp_ids_938.txt
MAKING TREE Fagales (✧ ꒪◞౪◟꒪)
MAKING DIRS IN application/PyPHLAWD-master/examples/clustered ヾ(≧∪≦)ノ〃
PROBLEM CREATING application/PyPHLAWD-master/examples/clustered/Fagales_3502 (゜´Д`゜)
POPULATING DIRS application/PyPHLAWD-master/examples/clustered ₊·◟(˶╹̆ꇴ╹̆˵)◜‧・
Traceback (most recent call last):
File "/home/yang/application/PyPHLAWD-master/src/populate_dirs_first.py", line 47, in
Hi and a happy new year,
I'm experiencing the same issue; any help would be highly appreciated!
It would also be nice if the website (https://fephyfofum.github.io/PyPHLAWD/) could be updated, as there is no longer a setup_clade.py (it is now called setup_clade_ap.py).
Cheers Bastian
Hi bheimubu! Happy new year!
Regarding your question: mine works with the old version of PyPHLAWD, so if you have an old version, you could try that. The new version doesn't work well at the moment. Good luck!
Yingyya
Hi @YingyingYang2019,
you made my day, it's working with the old version (downloaded as source code from here).
Cheers Bastian
Hi. I would just like to add that I was having the same trouble. If you figure anything out, please keep me updated. I also couldn't work out how to get the genus and sequence for this. If that is possible, please let me know. Here is the command I am having trouble with, along with its output:
python3 setup_clade_ap.py -t Laurales -b /Users/administrator_ge/Desktop/pln.db -s /Users/administrator_ge/Desktop/seq -o /Users/administrator_ge/Desktop/output -l /Users/administrator_ge/Desktop/logfile.md.gz -f /Users/administrator_ge/Desktop/taxalist.txt
STARTING PYPHLAWD ٩(⚙ȏ⚙)۶
LIMITING TO TAXA IN /Users/administrator_ge/Desktop/taxalist.txt
MAKING TREE Laurales ╰(✧∇✧)╯
MAKING DIRS IN /Users/administrator_ge/Desktop/output Σ(ノ°▽°)ノ
PROBLEM CREATING /Users/administrator_ge/Desktop/output/Laurales_3432 (;へ:)
POPULATING DIRS /Users/administrator_ge/Desktop/output Σ(*ノ´>ω<。`)ノ
Traceback (most recent call last):
File "/Users/administrator_ge/apps/PyPHLAWD/src/populate_dirs_first.py", line 47, in
IndexError: list index out of range
PYPHLAWD DONE ୧༼✿ ͡◕ д ◕͡ ༽୨
Total time (H:M:S): 0:01:01.869942 ヽ(^o^)丿
(⌐■_■)
Question: Where is the -s parameter (SEQGZFOLDER) for setup_clade_ap.py meant to point?
Issue: I seem to be having a problem populating the gzip directory with sequences. The .table file is fully populated from the ncbi db, but the sequences are not being found. I'm not sure where the -s parameter is supposed to point. ~/ is where all the compressed ncbi files from phlawd_db_maker are.
Steps taken: Followed the steps on the Install page. Built phlawd_db_maker and all dependencies without errors. Built the database with phlawd_db_maker with no errors. Followed directions on the Runs page for a clustering analysis. Python version is 3.8.10.
I know Python pretty well, so if I find a fix I'll make a pull request.
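While narrowing this down, it can help to rule out simple path problems before launching a run. Below is a rough pre-flight sketch, not part of PyPHLAWD. It assumes, without confirmation from the docs, that -s should point at a folder that directly contains the compressed `.gz` files written by phlawd_db_maker. The function name `check_setup_inputs` and its messages are made up for illustration.

```python
import os


def check_setup_inputs(db_path, seq_dir, out_dir, taxalist_path=None):
    """Return a list of problems found with the paths given to a run.

    An empty list means every check passed. The checks only look at
    the filesystem; they do not validate the database contents.
    """
    problems = []
    if not os.path.isfile(db_path):
        problems.append(f"database file not found: {db_path}")
    if not os.path.isdir(seq_dir):
        problems.append(f"sequence folder not found: {seq_dir}")
    # Assumption: the compressed sequence files sit directly in seq_dir.
    elif not any(name.endswith(".gz") for name in os.listdir(seq_dir)):
        problems.append(f"no .gz files in sequence folder: {seq_dir}")
    if not os.path.isdir(out_dir):
        problems.append(f"output folder not found: {out_dir}")
    if taxalist_path is not None:
        if not os.path.isfile(taxalist_path):
            problems.append(f"taxalist not found: {taxalist_path}")
        elif os.path.getsize(taxalist_path) == 0:
            problems.append(f"taxalist is empty: {taxalist_path}")
    return problems
```

Passing the same values you give to -b, -s, -o and -f and printing the returned list shows whether any of them point somewhere unexpected. A clean result does not guarantee that the run will populate sequences.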