wwood / kingfisher-download

Easier download/extract of FASTA/Q read data and metadata from the ENA, NCBI, AWS or GCP.
https://wwood.github.io/kingfisher-download
GNU General Public License v3.0

Does not work with python3.9 #12

Closed (harper357 closed 3 years ago)

harper357 commented 3 years ago

In Python 3.9, Element.getchildren() was removed from xml.etree.ElementTree. This causes the following error when running the annotate subcommand:

$ kingfisher annotate -r ERR1739691 --debug
08/10/2021 05:45:29 PM INFO: Kingfisher v0.0.1-dev
08/10/2021 05:45:29 PM INFO: Querying NCBI esearch for 1 distinct accessions e.g. ERR1739691
08/10/2021 05:45:29 PM DEBUG: Starting new HTTPS connection (1): eutils.ncbi.nlm.nih.gov:443
08/10/2021 05:45:29 PM DEBUG: https://eutils.ncbi.nlm.nih.gov:443 "GET /entrez/eutils/esearch.fcgi?db=sra&term=ERR1739691%5Baccn%5D&tool=kingfisher&email=kingfisher%40github.com&retmax=1000 HTTP/1.1" 200 None
Traceback (most recent call last):
  File "/opt/homebrew/bin/kingfisher", line 275, in <module>
    main()
  File "/opt/homebrew/bin/kingfisher", line 261, in main
    kingfisher.annotate(
  File "/opt/homebrew/Cellar/kingfisher-download/0.0.1-dev/bin/../kingfisher/__init__.py", line 438, in annotate
    metadata = SraMetadata().efetch_sra_from_accessions(run_identifiers)
  File "/opt/homebrew/Cellar/kingfisher-download/0.0.1-dev/bin/../kingfisher/sra_metadata.py", line 132, in efetch_sra_from_accessions
    ids = list(set([c.text for c in id_list_node.getchildren()]))
AttributeError: 'xml.etree.ElementTree.Element' object has no attribute 'getchildren'

see: https://docs.python.org/3.8/library/xml.etree.elementtree.html#xml.etree.ElementTree.Element.getchildren
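For reference, a minimal sketch of one way to avoid the removed method: iterate over the element directly, which yields its direct children on every Python 3 version. This is not necessarily the fix that went into kingfisher. The helper name child_texts and the trimmed sample XML are made up for illustration; only the ID value comes from the log output in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed esearch-style response (assumption: the real NCBI
# payload carries more fields than shown here).
SAMPLE_XML = """<eSearchResult>
  <IdList>
    <Id>4165047</Id>
    <Id>4165047</Id>
  </IdList>
</eSearchResult>"""


def child_texts(node):
    """Return the distinct text values of a node's direct children, sorted."""
    # Iterating an Element yields its direct children, so this stands in for
    # node.getchildren(), which no longer exists in Python 3.9.
    return sorted({child.text for child in node})


root = ET.fromstring(SAMPLE_XML)
id_list_node = root.find("IdList")
ids = child_texts(id_list_node)
print(ids)
```

list(node) gives the same children as a list if the order and duplicates need to be preserved.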

wwood commented 3 years ago

Thanks for the well-written bug report. However, I wasn't able to reproduce it:

❯ bin/kingfisher annotate -r ERR1739691 --debug
08/11/2021 01:54:07 AM INFO: Kingfisher v0.0.1-dev
08/11/2021 01:54:07 AM INFO: Querying NCBI esearch for 1 distinct accessions e.g. ERR1739691
08/11/2021 01:54:07 AM DEBUG: Starting new HTTPS connection (1): eutils.ncbi.nlm.nih.gov:443
08/11/2021 01:54:09 AM DEBUG: https://eutils.ncbi.nlm.nih.gov:443 "GET /entrez/eutils/esearch.fcgi?db=sra&term=ERR1739691%5Baccn%5D&tool=kingfisher&email=kingfisher%40github.com&retmax=1000 HTTP/1.1" 200 None
08/11/2021 01:54:09 AM INFO: Querying NCBI efetch for 1 distinct IDs e.g. 4165047
08/11/2021 01:54:09 AM DEBUG: Running efetch for IDs with request term: 4165047
08/11/2021 01:54:09 AM DEBUG: Starting new HTTPS connection (1): eutils.ncbi.nlm.nih.gov:443
08/11/2021 01:54:10 AM DEBUG: https://eutils.ncbi.nlm.nih.gov:443 "GET /entrez/eutils/efetch.fcgi?db=sra&id=4165047&tool=kingfisher&email=kingfisher%40github.com&rettype=runinfo&retmode=text HTTP/1.1" 200 None
Run        | SRAStudy  | Gbp   | LibraryStrategy | LibrarySelection | Model               | SampleName   | ScientificName
---------- | --------- | ----- | --------------- | ---------------- | ------------------- | ------------ | --------------
ERR1739691 | ERP017539 | 2.382 | WGS             | RANDOM           | Illumina HiSeq 2500 | SAMEA4497179 | metagenome
08/11/2021 01:54:10 AM INFO: Kingfisher done.

Can I ask what git version you are on (and whether updating to main's HEAD fixes it)? Thanks.

wwood commented 3 years ago

Oh, I'm sorry. I retested on Python 3.9 and got your error (it worked fine with 3.6). I'm on it.

harper357 commented 3 years ago

No worries. Thanks!

wwood commented 3 years ago

Better, I hope?