stajichlab / AAFTF

Automatic Assembly For The Fungi
MIT License

error when running AAFTF filter and AAFTF vecscreen #21

Closed njliangdong closed 1 year ago

njliangdong commented 1 year ago

Hi, dear AAFTF team. When I run the AAFTF filter and AAFTF vecscreen commands, I get the following error:

Traceback (most recent call last):
  File "/home/liangdong/opt/anaconda3/bin/AAFTF", line 8, in <module>
    sys.exit(main())
  File "/home/liangdong/opt/anaconda3/lib/python3.10/site-packages/AAFTF/AAFTF_main.py", line 936, in main
    args.func(parser, args)
  File "/home/liangdong/opt/anaconda3/lib/python3.10/site-packages/AAFTF/AAFTF_main.py", line 47, in run_subtool
    submodule.run(parser, args)
  File "/home/liangdong/opt/anaconda3/lib/python3.10/site-packages/AAFTF/vecscreen.py", line 285, in run
    urllib.request.urlretrieve(url, file)
  File "/home/liangdong/opt/anaconda3/lib/python3.10/urllib/request.py", line 241, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/home/liangdong/opt/anaconda3/lib/python3.10/urllib/request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
  File "/home/liangdong/opt/anaconda3/lib/python3.10/urllib/request.py", line 525, in open
    response = meth(req, response)
  File "/home/liangdong/opt/anaconda3/lib/python3.10/urllib/request.py", line 634, in http_response
    response = self.parent.error(
  File "/home/liangdong/opt/anaconda3/lib/python3.10/urllib/request.py", line 563, in error
    return self._call_chain(*args)
  File "/home/liangdong/opt/anaconda3/lib/python3.10/urllib/request.py", line 496, in _call_chain
    result = func(*args)
  File "/home/liangdong/opt/anaconda3/lib/python3.10/urllib/request.py", line 643, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found

I'm not sure what happened during this process. Is it an issue with my internet connection? If so, is there a proxy or mirror site I can use in China? Here are my commands:

(1)AAFTF filter -c 16 --memory 48 --aligner bbduk -o ${sp}_filter --left ${sp}_trim_1P.fastq.gz --right ${sp}_trim_2P.fastq.gz --pipe --AAFTF_DB ./ref_genome
(2)AAFTF vecscreen -c 16 -i $sp.spades.assembly.fa -o $sp.assembly.vecscreen.out -s high --pipe

Here are my Python version and installation path:

python version: 3.10.9 (main, Mar 1 2023, 18:23:06) [GCC 11.2.0] on linux
installation path: /home/liangdong/opt/anaconda3/bin/python

I installed AAFTF with pip install.

Thanks and best regards

nextgenusfs commented 1 year ago

Yes, I think it's probably your internet connection, and that it was unable to download the vecscreen database from NCBI. For vecscreen, it tries to download the following resources from NCBI:

NCBI='https://ftp.ncbi.nlm.nih.gov'
    'UniVec': [NCBI+'/pub/UniVec/UniVec'],
    'CONTAM_EUKS': [NCBI + '/pub/kitts/contam_in_euks.fa.gz'],
    'CONTAM_PROKS': [NCBI + '/pub/kitts/contam_in_prok.fa'],
    'MITO': [NCBI + '/refseq/release/mitochondrion/' +
             'mitochondrion.1.1.genomic.fna.gz',
             NCBI + '/refseq/release/mitochondrion/' +
             'mitochondrion.2.1.genomic.fna.gz']
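
To tell a connection problem apart from a file that is missing on the server, one option is to probe each URL directly. The following is only a minimal sketch, not part of AAFTF: the `probe` and `report` helpers are hypothetical names, and the URLs are copied from the list above. A 404 means the server was reached but the path no longer exists, whereas a connection problem points at the network or a proxy.

```python
import urllib.error
import urllib.request

NCBI = "https://ftp.ncbi.nlm.nih.gov"

# URLs copied from the resource list above.
URLS = [
    NCBI + "/pub/UniVec/UniVec",
    NCBI + "/pub/kitts/contam_in_euks.fa.gz",
    NCBI + "/pub/kitts/contam_in_prok.fa",
    NCBI + "/refseq/release/mitochondrion/mitochondrion.1.1.genomic.fna.gz",
    NCBI + "/refseq/release/mitochondrion/mitochondrion.2.1.genomic.fna.gz",
]


def probe(url, opener=urllib.request.urlopen, timeout=30):
    """Return the HTTP status code for url, or a short error string.

    A HEAD request is used so that no file content is downloaded.
    The opener argument exists only so the function can be tested offline.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404.
        return err.code
    except urllib.error.URLError as err:
        # The server could not be reached at all (DNS, proxy, firewall).
        return f"connection problem: {err.reason}"


def report(urls=URLS):
    """Print one line per URL: the probe result followed by the URL."""
    for url in urls:
        print(probe(url), url)
```

Calling `report()` prints one result per resource, which shows which of the downloads is the one failing.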

DianaOaxaca commented 1 year ago

Hi, I got the same error when I ran the filter step, but I don't have any connection or download problems. Could you please help me with this error?

hyphaltip commented 1 year ago

I think NCBI moved the folder. You are welcome to sync this folder from our system https://cluster.hpcc.ucr.edu/~jstajich/AAFTF/ into your local AAFTF_DB folder (set the env variable $AAFTF_DB to point to whatever folder you want).

We need to put these up in another permanent location. I added a new set of steps that use the NCBI FCS screening tool, which will replace some of this too, but filter is still helpful for improving the assembly if you have contaminants.

I don't have a lot of time right now to work on it but I can try to integrate what we need into a more persistent location on osf.io or zenodo or in another way.

DianaOaxaca commented 1 year ago

@hyphaltip thank you very much for the answer, it works fine for me :)

hyphaltip commented 1 year ago

I've fixed the automatic download of the DBs in sourpurge now, and the old genbank-k31-lca is available again via OSF.io URLs. You just need a writeable AAFTF_DB (you can pass it on the command line or set the env variable).
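
A quick way to confirm that the database folder is usable before launching a run is sketched below. This is not AAFTF's own logic: `resolve_db_dir` and `is_writeable_dir` are hypothetical helpers, and giving an explicitly passed value precedence over the `AAFTF_DB` environment variable is an assumption made for illustration.

```python
import os
from pathlib import Path


def resolve_db_dir(cli_value=None, environ=os.environ):
    """Pick the database folder: an explicit value first, then $AAFTF_DB.

    Returns None when neither is set. The precedence is an assumption,
    not something taken from the AAFTF source.
    """
    value = cli_value or environ.get("AAFTF_DB")
    return Path(value).expanduser() if value else None


def is_writeable_dir(path):
    """Return True if path is an existing directory that files can be created in."""
    return (
        path is not None
        and path.is_dir()
        and os.access(path, os.W_OK | os.X_OK)
    )
```

For example, `is_writeable_dir(resolve_db_dir())` returns False when `AAFTF_DB` is unset, points at a folder that does not exist, or points at a read-only location.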