lozuponelab / AMON

Annotation of Metabolite Origin via Networks: A tool for predicting putative metabolite origins for microbes or between microbes and host with or without metabolomics data
MIT License

Error in specific samples #9

Open Lucas-Maciel opened 4 years ago

Lucas-Maciel commented 4 years ago

Hi,

I'm using AMON on my metagenomic data. I have 79 MAGs, and for 70 of them it ran without problems. For the other 9, however, I get the following error. I suspect it is caused by an unrecognized KO annotation, but I don't know how to find out which one.

amon.py -i ko_list.txt -o ../teste
Traceback (most recent call last):
  File "/home/ABTLUS/lucas.maciel/anaconda3/envs/AMON/bin/amon.py", line 74, in <module>
    main(kos_loc, output_dir, other_kos_loc, detected_compounds, name1, name2, keep_separated, samples_are_columns,
  File "/home/ABTLUS/lucas.maciel/anaconda3/envs/AMON/lib/python3.8/site-packages/AMON/predict_metabolites.py", line 283, in main
    ko_dict = get_kegg_record_dict(set(all_kos), parse_ko, ko_file_loc)
  File "/home/ABTLUS/lucas.maciel/anaconda3/envs/AMON/lib/python3.8/site-packages/KEGG_parser/downloader.py", line 55, in get_kegg_record_dict
    records = get_from_kegg_api(loop, list_of_ids, parser)
  File "/home/ABTLUS/lucas.maciel/anaconda3/envs/AMON/lib/python3.8/site-packages/KEGG_parser/downloader.py", line 49, in get_from_kegg_api
    return [parser(raw_record) for raw_record in loop.run_until_complete(kegg_download_manager(loop, list_of_ids))]
  File "/home/ABTLUS/lucas.maciel/anaconda3/envs/AMON/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/home/ABTLUS/lucas.maciel/anaconda3/envs/AMON/lib/python3.8/site-packages/KEGG_parser/downloader.py", line 43, in kegg_download_manager
    results = await asyncio.gather(*tasks)
  File "/home/ABTLUS/lucas.maciel/anaconda3/envs/AMON/lib/python3.8/site-packages/KEGG_parser/downloader.py", line 30, in download_coroutine
    return await response.text()
  File "/home/ABTLUS/lucas.maciel/anaconda3/envs/AMON/lib/python3.8/site-packages/aiohttp/client_reqrep.py", line 1014, in text
    return self._body.decode(encoding, errors=errors) # type: ignore
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 80011: invalid start byte
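For readers hitting the same UnicodeDecodeError: the last frame shows the response body being decoded as strict UTF-8 and failing on byte 0xa0. The following is only a minimal sketch of a more tolerant decode, assuming the raw bytes of the response are available. `decode_kegg_body` is a hypothetical helper written for illustration; it is not part of KEGG_parser or AMON, and whether KEGG actually serves Latin-1 text is an assumption.

```python
def decode_kegg_body(raw: bytes) -> str:
    """Decode a response body, tolerating bytes that are not valid UTF-8.

    Hypothetical helper for illustration; not part of KEGG_parser or AMON.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: the stray byte is Latin-1 text. Latin-1 maps every
        # byte to a character, so this fallback cannot raise; 0xa0 becomes
        # a non-breaking space (U+00A0).
        return raw.decode("latin-1")


# A lone 0xa0 byte is not valid UTF-8, so this takes the fallback path.
print(repr(decode_kegg_body(b"NAME  enzyme\xa0A")))
```

As the traceback itself shows, aiohttp's `text()` passes an `errors` argument through to `bytes.decode`, so calling it with `errors="replace"` may be a simpler alternative where the library code can be changed.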

Lucas-Maciel commented 4 years ago

When I tried installing on another machine, I got a different error:

Traceback (most recent call last):
  File "/home/ABTLUS/lucas.maciel/anaconda3/envs/AMON/bin/amon.py", line 74, in <module>
    main(kos_loc, output_dir, other_kos_loc, detected_compounds, name1, name2, keep_separated, samples_are_columns,
  File "/home/ABTLUS/lucas.maciel/anaconda3/envs/AMON/lib/python3.8/site-packages/AMON/predict_metabolites.py", line 283, in main
    ko_dict = get_kegg_record_dict(set(all_kos), parse_ko, ko_file_loc)
  File "/home/ABTLUS/lucas.maciel/anaconda3/envs/AMON/lib/python3.8/site-packages/KEGG_parser/downloader.py", line 55, in get_kegg_record_dict
    records = get_from_kegg_api(loop, list_of_ids, parser)
  File "/home/ABTLUS/lucas.maciel/anaconda3/envs/AMON/lib/python3.8/site-packages/KEGG_parser/downloader.py", line 49, in get_from_kegg_api
    return [parser(raw_record) for raw_record in loop.run_until_complete(kegg_download_manager(loop, list_of_ids))]
  File "/home/ABTLUS/lucas.maciel/anaconda3/envs/AMON/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/home/ABTLUS/lucas.maciel/anaconda3/envs/AMON/lib/python3.8/site-packages/KEGG_parser/downloader.py", line 43, in kegg_download_manager
    results = await asyncio.gather(*tasks)
  File "/home/ABTLUS/lucas.maciel/anaconda3/envs/AMON/lib/python3.8/site-packages/KEGG_parser/downloader.py", line 35, in download_coroutine
    raise ValueError('KEGG has forbidden request after %s attempts' % attempts)
ValueError: KEGG has forbidden request after 10 attempts

Tim-Sto commented 4 years ago

Hi Lucas-Maciel,

Did you find a solution for the second error you got in March (KEGG has forbidden request after 10 attempts)? I am trying to run AMON and I get the same error message.

Lucas-Maciel commented 4 years ago

@Tim-Sto, to tell the truth, I still don't really understand why, but I was able to make it work. If you run the same input about 10 times, one of the runs may succeed. Another thing that sometimes works, and I also don't know why: if you have a file with 1000 lines and it is not working, you can use head -n 1000 files.txt > ko_list.txt and run with this new file.
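The idea of running smaller inputs can be sketched as a small script that splits a KO list into fixed-size pieces. This is only an illustration, assuming the input file holds one KO per line. `split_ko_list`, the chunk size of 100, and the `_partN` file naming are all hypothetical choices, not anything AMON provides.

```python
from pathlib import Path
from typing import List


def split_ko_list(path: str, chunk_size: int = 100) -> List[Path]:
    """Write consecutive chunks of a KO list to numbered files.

    Assumes one KO per line; blank lines are dropped.
    Returns the paths of the files that were written.
    """
    src = Path(path)
    lines = [ln for ln in src.read_text().splitlines() if ln.strip()]
    out_paths = []
    for start in range(0, len(lines), chunk_size):
        part_number = start // chunk_size + 1
        out = src.with_name(f"{src.stem}_part{part_number}{src.suffix}")
        out.write_text("\n".join(lines[start:start + chunk_size]) + "\n")
        out_paths.append(out)
    return out_paths
```

Calling `split_ko_list("ko_list.txt", chunk_size=100)` would write `ko_list_part1.txt`, `ko_list_part2.txt`, and so on next to the original file. Each piece would then be run separately, and how to combine the per-piece results is not covered here.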

Tim-Sto commented 4 years ago

@Lucas-Maciel, thank you very much for your answer. I ran the same input 10 times and also tried the head command, but neither worked. However, I think I figured out what the problem is. I used a KO list based on the human genome (more than 10,000 KOs) as input for the host. As mentioned in the code, the KEGG API has download limits for users without a subscription, and a list of 10,000 KOs probably reaches them. @kthurimella, have you already found a workaround for this problem?

shafferm commented 4 years ago

Hey @Lucas-Maciel and @Tim-Sto,

We have looked for ways around this, but we have never been able to find out what the limits of the KEGG API are. Some of their documentation mentions a rate limit (e.g. no more than 1000 requests per minute), but it never says what the actual rate is. My recommendation is to run subsets, like @Lucas-Maciel said. If we knew what the KEGG API limits were, we could set up AMON to only poll their servers within those limits, but since they don't publish them, all we can do is guess. We haven't found any parameters that get around it better than the ones set as default in AMON. You can also try using the --save_entries flag to save the output of the KEGG API in JSON format, and then you could analyze the results manually. AMON does not currently support taking those JSON files as input.
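Polling "within this limit" could look roughly like the sketch below, which spaces requests at least `min_interval` seconds apart. It is a generic illustration rather than AMON's code: `fetch_throttled` is a hypothetical name, the caller supplies the `fetch` function, and the default interval of 1 second is a pure guess, since the real limit is unknown.

```python
import time
from typing import Callable, Iterable, List


def fetch_throttled(urls: Iterable[str],
                    fetch: Callable[[str], str],
                    min_interval: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep,
                    clock: Callable[[], float] = time.monotonic) -> List[str]:
    """Call fetch(url) for each URL, starting calls at least min_interval apart.

    sleep and clock are injectable so the pacing can be tested without waiting.
    """
    results = []
    last_start = None
    for url in urls:
        if last_start is not None:
            # Wait out whatever remains of the interval since the last request.
            remaining = min_interval - (clock() - last_start)
            if remaining > 0:
                sleep(remaining)
        last_start = clock()
        results.append(fetch(url))
    return results
```

If the true limit were ever known, `min_interval` would simply be set to 60 divided by the allowed requests per minute.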

Sorry about the lack of an answer, but it seems surprisingly hard to find info in this area.

Mike

vindarbot commented 3 years ago

Same problem. Do you know where I can find these files:

--ko_file_loc KO_FILE_LOC Location of ko file from KEGG FTP download (default: None)
--rn_file_loc RN_FILE_LOC Location of reaction file from KEGG FTP download (default: None)
--co_file_loc CO_FILE_LOC Location of compound file from KEGG FTP download (default: None)
--pathway_file_loc PATHWAY_FILE_LOC

so that AMON does not have to send requests to KEGG?

Thanks in advance

sterrettJD commented 1 year ago

Hi all, I believe I've fixed this issue with the latest release of KEGG_Parser (which is now bumped to 0.0.7 to fix pip compatibility issues). If the asynchronous downloads are forbidden (due to the request rate being too high), it will download each URL from the KEGG API sequentially. This is quite a bit slower, but it does get around the issue.
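The fallback described above can be illustrated with a generic sketch: try all downloads concurrently, and if that raises, repeat them one at a time. This is not the KEGG_Parser implementation. `download_with_fallback` and both fetch callables are hypothetical, and catching `ValueError` is an assumption based on the exception type shown in the tracebacks in this thread.

```python
import asyncio
from typing import Awaitable, Callable, List


def download_with_fallback(urls: List[str],
                           fetch_async: Callable[[str], Awaitable[str]],
                           fetch_sync: Callable[[str], str]) -> List[str]:
    """Download all URLs concurrently; on ValueError, retry them sequentially."""

    async def gather_all() -> List[str]:
        # Results come back in the same order as the input URLs.
        return await asyncio.gather(*(fetch_async(u) for u in urls))

    try:
        return list(asyncio.run(gather_all()))
    except ValueError:
        # Assumption: a forbidden request surfaces as ValueError,
        # as in the tracebacks in this thread.
        return [fetch_sync(u) for u in urls]
```

Note that the sequential pass repeats every URL, including any that succeeded in the concurrent attempt, which keeps the sketch simple at the cost of extra requests.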

raeshrode commented 9 months ago

Hello @sterrettJD thank you for updating KEGG_Parser! I am using version 0.0.7 but unfortunately I am getting the same error as @Lucas-Maciel. I am thinking of downloading the KEGG FTP files. Where can I find those? @vindarbot were you able to locate them?

Thanks! :)

Trying to install in other machine I've got other errors

Traceback (most recent call last):
  File "/home/ABTLUS/lucas.maciel/anaconda3/envs/AMON/bin/amon.py", line 74, in <module>
    main(kos_loc, output_dir, other_kos_loc, detected_compounds, name1, name2, keep_separated, samples_are_columns,
  File "/home/ABTLUS/lucas.maciel/anaconda3/envs/AMON/lib/python3.8/site-packages/AMON/predict_metabolites.py", line 283, in main
    ko_dict = get_kegg_record_dict(set(all_kos), parse_ko, ko_file_loc)
  File "/home/ABTLUS/lucas.maciel/anaconda3/envs/AMON/lib/python3.8/site-packages/KEGG_parser/downloader.py", line 55, in get_kegg_record_dict
    records = get_from_kegg_api(loop, list_of_ids, parser)
  File "/home/ABTLUS/lucas.maciel/anaconda3/envs/AMON/lib/python3.8/site-packages/KEGG_parser/downloader.py", line 49, in get_from_kegg_api
    return [parser(raw_record) for raw_record in loop.run_until_complete(kegg_download_manager(loop, list_of_ids))]
  File "/home/ABTLUS/lucas.maciel/anaconda3/envs/AMON/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/home/ABTLUS/lucas.maciel/anaconda3/envs/AMON/lib/python3.8/site-packages/KEGG_parser/downloader.py", line 43, in kegg_download_manager
    results = await asyncio.gather(*tasks)
  File "/home/ABTLUS/lucas.maciel/anaconda3/envs/AMON/lib/python3.8/site-packages/KEGG_parser/downloader.py", line 35, in download_coroutine
    raise ValueError('KEGG has forbidden request after %s attempts' % attempts)
ValueError: KEGG has forbidden request after 10 attempts

sterrettJD commented 9 months ago

Hey @raeshrode , that's weird - I'll look into it! In the meantime, can you post the error from your computer + all the versions for your packages (output of conda list)?

Regarding the KEGG FTP, those files can be accessed here, but unfortunately you need to be a KEGG subscriber to download them :/ which is why we have to download things from KEGG individually

raeshrode commented 9 months ago

Thank you for the quick response @sterrettJD ! Bummer on the KEGG subscription, but thank you for the link to that too.

My AMON environment packages and versions:

# packages in environment at /Users/rshrode/miniconda3/envs/AMON:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       2_gnu    conda-forge
aiohttp                   3.9.1           py312h98912ed_0    conda-forge
aiosignal                 1.3.1              pyhd8ed1ab_0    conda-forge
alsa-lib                  1.2.10               hd590300_0    conda-forge
asyncio                   3.4.3                    pypi_0    pypi
attr                      2.5.1                h166bdaf_1    conda-forge
attrs                     23.2.0             pyh71513ae_0    conda-forge
biom-format               2.1.15          py312h98912ed_1    conda-forge
brotli                    1.1.0                hd590300_1    conda-forge
brotli-bin                1.1.0                hd590300_1    conda-forge
bzip2                     1.0.8                hd590300_5    conda-forge
c-ares                    1.24.0               hd590300_0    conda-forge
ca-certificates           2023.11.17           hbcca054_0    conda-forge
cached-property           1.5.2                hd8ed1ab_1    conda-forge
cached_property           1.5.2              pyha770c72_1    conda-forge
cairo                     1.18.0               h3faef2a_0    conda-forge
certifi                   2023.11.17         pyhd8ed1ab_0    conda-forge
charset-normalizer        3.3.2                    pypi_0    pypi
click                     8.1.7           unix_pyh707e725_0    conda-forge
colorama                  0.4.6              pyhd8ed1ab_0    conda-forge
contourpy                 1.2.0           py312h8572e83_0    conda-forge
cycler                    0.12.1             pyhd8ed1ab_0    conda-forge
dbus                      1.13.6               h5008d03_3    conda-forge
exceptiongroup            1.2.0              pyhd8ed1ab_0    conda-forge
expat                     2.5.0                hcb278e6_1    conda-forge
font-ttf-dejavu-sans-mono 2.37                 hab24e00_0    conda-forge
font-ttf-inconsolata      3.000                h77eed37_0    conda-forge
font-ttf-source-code-pro  2.038                h77eed37_0    conda-forge
font-ttf-ubuntu           0.83                 h77eed37_1    conda-forge
fontconfig                2.14.2               h14ed4e7_0    conda-forge
fonts-conda-ecosystem     1                             0    conda-forge
fonts-conda-forge         1                             0    conda-forge
fonttools                 4.47.0          py312h98912ed_0    conda-forge
freetype                  2.12.1               h267a509_2    conda-forge
frozenlist                1.4.1           py312h98912ed_0    conda-forge
gettext                   0.21.1               h27087fc_0    conda-forge
glib                      2.78.3               hfc55251_0    conda-forge
glib-tools                2.78.3               hfc55251_0    conda-forge
graphite2                 1.3.13            h58526e2_1001    conda-forge
gst-plugins-base          1.22.8               h8e1006c_1    conda-forge
gstreamer                 1.22.8               h98fc4e7_1    conda-forge
h5py                      3.10.0          nompi_py312h1b477d7_101    conda-forge
harfbuzz                  8.3.0                h3d44ed6_0    conda-forge
hdf5                      1.14.3          nompi_h4f84152_100    conda-forge
icu                       73.2                 h59595ed_0    conda-forge
idna                      3.6                pyhd8ed1ab_0    conda-forge
iniconfig                 2.0.0              pyhd8ed1ab_0    conda-forge
kegg-parser               0.0.7                    pypi_0    pypi
keyutils                  1.6.1                h166bdaf_0    conda-forge
kiwisolver                1.4.5           py312h8572e83_1    conda-forge
krb5                      1.21.2               h659d440_0    conda-forge
lame                      3.100             h166bdaf_1003    conda-forge
lcms2                     2.16                 hb7c19ff_0    conda-forge
ld_impl_linux-64          2.40                 h41732ed_0    conda-forge
lerc                      4.0.0                h27087fc_0    conda-forge
libaec                    1.1.2                h59595ed_1    conda-forge
libblas                   3.9.0           20_linux64_openblas    conda-forge
libbrotlicommon           1.1.0                hd590300_1    conda-forge
libbrotlidec              1.1.0                hd590300_1    conda-forge
libbrotlienc              1.1.0                hd590300_1    conda-forge
libcap                    2.69                 h0f662aa_0    conda-forge
libcblas                  3.9.0           20_linux64_openblas    conda-forge
libclang                  15.0.7          default_hb11cfb5_4    conda-forge
libclang13                15.0.7          default_ha2b6cf4_4    conda-forge
libcups                   2.3.3                h4637d8d_4    conda-forge
libcurl                   8.5.0                hca28451_0    conda-forge
libdeflate                1.19                 hd590300_0    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libev                     4.33                 hd590300_2    conda-forge
libevent                  2.1.12               hf998b51_1    conda-forge
libexpat                  2.5.0                hcb278e6_1    conda-forge
libffi                    3.4.2                h7f98852_5    conda-forge
libflac                   1.4.3                h59595ed_0    conda-forge
libgcc-ng                 13.2.0               h807b86a_3    conda-forge
libgcrypt                 1.10.3               hd590300_0    conda-forge
libgfortran-ng            13.2.0               h69a702a_3    conda-forge
libgfortran5              13.2.0               ha4646dd_3    conda-forge
libglib                   2.78.3               h783c2da_0    conda-forge
libgomp                   13.2.0               h807b86a_3    conda-forge
libgpg-error              1.47                 h71f35ed_0    conda-forge
libiconv                  1.17                 hd590300_2    conda-forge
libjpeg-turbo             3.0.0                hd590300_1    conda-forge
liblapack                 3.9.0           20_linux64_openblas    conda-forge
libllvm15                 15.0.7               hb3ce162_4    conda-forge
libnghttp2                1.58.0               h47da74e_1    conda-forge
libnsl                    2.0.1                hd590300_0    conda-forge
libogg                    1.3.4                h7f98852_1    conda-forge
libopenblas               0.3.25          pthreads_h413a1c8_0    conda-forge
libopus                   1.3.1                h7f98852_1    conda-forge
libpng                    1.6.39               h753d276_0    conda-forge
libpq                     16.1                 h33b98f1_7    conda-forge
libsndfile                1.2.2                hc60ed4a_1    conda-forge
libsqlite                 3.44.2               h2797004_0    conda-forge
libssh2                   1.11.0               h0841786_0    conda-forge
libstdcxx-ng              13.2.0               h7e041cc_3    conda-forge
libsystemd0               255                  h3516f8a_0    conda-forge
libtiff                   4.6.0                ha9c0a0a_2    conda-forge
libuuid                   2.38.1               h0b41bf4_0    conda-forge
libvorbis                 1.3.7                h9c3ff4c_0    conda-forge
libwebp-base              1.3.2                hd590300_0    conda-forge
libxcb                    1.15                 h0b41bf4_0    conda-forge
libxcrypt                 4.4.36               hd590300_1    conda-forge
libxkbcommon              1.6.0                hd429924_1    conda-forge
libxml2                   2.12.3               h232c23b_0    conda-forge
libzlib                   1.2.13               hd590300_5    conda-forge
lz4-c                     1.9.4                hcb278e6_0    conda-forge
matplotlib                3.8.2           py312h7900ff3_0    conda-forge
matplotlib-base           3.8.2           py312he5832f3_0    conda-forge
matplotlib-venn           0.11.9                   pypi_0    pypi
mpg123                    1.32.3               h59595ed_0    conda-forge
multidict                 6.0.4           py312h98912ed_1    conda-forge
munkres                   1.1.4              pyh9f0ad1d_0    conda-forge
mysql-common              8.0.33               hf1915f5_6    conda-forge
mysql-libs                8.0.33               hca2cd23_6    conda-forge
ncurses                   6.4                  h59595ed_2    conda-forge
nspr                      4.35                 h27087fc_0    conda-forge
nss                       3.96                 h1d7d5a4_0    conda-forge
numpy                     1.26.3          py312heda63a1_0    conda-forge
openjpeg                  2.5.0                h488ebb8_3    conda-forge
openssl                   3.2.0                hd590300_1    conda-forge
packaging                 23.2               pyhd8ed1ab_0    conda-forge
pandas                    2.1.4           py312hfb8ada1_0    conda-forge
patsy                     0.5.6              pyhd8ed1ab_0    conda-forge
pcre2                     10.42                hcad00b1_0    conda-forge
pillow                    10.2.0          py312hf3581a9_0    conda-forge
pip                       23.3.2             pyhd8ed1ab_0    conda-forge
pixman                    0.43.0               h59595ed_0    conda-forge
pluggy                    1.3.0              pyhd8ed1ab_0    conda-forge
ply                       3.11                       py_1    conda-forge
pthread-stubs             0.4               h36c2ea0_1001    conda-forge
pulseaudio-client         16.1                 hb77b528_5    conda-forge
pyparsing                 3.1.1              pyhd8ed1ab_0    conda-forge
pyqt                      5.15.9          py312h949fe66_5    conda-forge
pyqt5-sip                 12.12.2         py312h30efb56_5    conda-forge
pytest                    7.4.4              pyhd8ed1ab_0    conda-forge
python                    3.12.1          hab00c5b_1_cpython    conda-forge
python-dateutil           2.8.2              pyhd8ed1ab_0    conda-forge
python-tzdata             2023.4             pyhd8ed1ab_0    conda-forge
python_abi                3.12                    4_cp312    conda-forge
pytz                      2023.3.post1       pyhd8ed1ab_0    conda-forge
qt-main                   5.15.8              h450f30e_18    conda-forge
readline                  8.2                  h8228510_1    conda-forge
requests                  2.31.0                   pypi_0    pypi
scipy                     1.11.4          py312heda63a1_0    conda-forge
seaborn                   0.13.1               hd8ed1ab_0    conda-forge
seaborn-base              0.13.1             pyhd8ed1ab_0    conda-forge
setuptools                69.0.3             pyhd8ed1ab_0    conda-forge
sip                       6.7.12          py312h30efb56_0    conda-forge
six                       1.16.0             pyh6c4a22f_0    conda-forge
statsmodels               0.14.1          py312hc7c0aa3_0    conda-forge
tk                        8.6.13          noxft_h4845f30_101    conda-forge
toml                      0.10.2             pyhd8ed1ab_0    conda-forge
tomli                     2.0.1              pyhd8ed1ab_0    conda-forge
tornado                   6.3.3           py312h98912ed_1    conda-forge
tqdm                      4.66.1                   pypi_0    pypi
tzdata                    2023d                h0c530f3_0    conda-forge
urllib3                   2.1.0                    pypi_0    pypi
wheel                     0.42.0             pyhd8ed1ab_0    conda-forge
xcb-util                  0.4.0                hd590300_1    conda-forge
xcb-util-image            0.4.0                h8ee46fc_1    conda-forge
xcb-util-keysyms          0.4.0                h8ee46fc_1    conda-forge
xcb-util-renderutil       0.3.9                hd590300_1    conda-forge
xcb-util-wm               0.4.1                h8ee46fc_1    conda-forge
xkeyboard-config          2.40                 hd590300_0    conda-forge
xorg-kbproto              1.0.7             h7f98852_1002    conda-forge
xorg-libice               1.1.1                hd590300_0    conda-forge
xorg-libsm                1.2.4                h7391055_0    conda-forge
xorg-libx11               1.8.7                h8ee46fc_0    conda-forge
xorg-libxau               1.0.11               hd590300_0    conda-forge
xorg-libxdmcp             1.1.3                h7f98852_0    conda-forge
xorg-libxext              1.3.4                h0b41bf4_2    conda-forge
xorg-libxrender           0.9.11               hd590300_0    conda-forge
xorg-renderproto          0.11.1            h7f98852_1002    conda-forge
xorg-xextproto            7.3.0             h0b41bf4_1003    conda-forge
xorg-xf86vidmodeproto     2.3.1             h7f98852_1002    conda-forge
xorg-xproto               7.0.31            h7f98852_1007    conda-forge
xz                        5.2.6                h166bdaf_0    conda-forge
yarl                      1.9.3           py312h98912ed_0    conda-forge
zlib                      1.2.13               hd590300_5    conda-forge
zstd                      1.5.5                hfc55251_0    conda-forge

My error:

Asynchronous downloading of KEGG records has failed. KEGG parser will try to download data sequentially.This will be slower.
Total urls to download: 1359. Progress will be shown below.
0% |
1 0/1359 [00:10<?, ?it/s]
Traceback (most recent call last) :
File "/Users/rshrode/miniconda3/lib/python3.8/site-packages/KEGG_parser/downloader.py",line78,inget_from_kegg_api
return [parser(raw_record) for raw_record in loop.run_until_complete(kegg_download_manager(loop, list_of_ids)) ]
File "/Users/rshrode/miniconda3/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
return future.result()
File "/Users/rshrode/miniconda3/lib/python3.8/site-packages/KEGG_parser/downloader.py"
, line 47, in kegg_download manager
results = await asyncio.gather(*tasks)
File "/Users/rshrode/miniconda3/lib/python3.8/site-packages/KEGG_parser/downloader.py",line38,indownload_coroutine
raise ValueError('KEGG has forbidden request after %s attempts for url %s
which returns a response status of %s'
%
ValueError: KEGG has forbidden request after 10 attempts for url http: //rest.kegg.jp/get/K19756+K02335+K17597+K10099+K04207+K04000+06547+K26129+K02616+K25102
hich returns a response status of 403
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/rshrode/miniconda3/bin/amon.py", line 74, in <module>
main(kos_loc, output_dir, other_kos_loc, detected compounds, name1, name2, keep_separated,
samples are columns,
File "/Users/rshrode/miniconda3/lib/ovthon3.8/site-packages/AMON/predict metabolites.py", line 283, in main
ko dict = get kegg record dict(set(all kos), parse ko, ko file loc)
File "/Users/rshrode/miniconda3/lib/python3.8/site-packages/KEGG_parser/downloader.py",line88,inget_kegg_record_dict
records = get_from_kegg_api(loop, list of ids, parser)
File "/Users/rshrode/miniconda3/lib/python3.8/site-packages/KEGG parser/downloader.py", line 83, in get from kegg api
return [parser (raw record) for raw record in kegg download manager synchronous (list of ids) ]
File
")Users/rshrode/miniconda3/lib/ovthon3.8/site-backades/KEGGparser/downloader.py™,line69,inkegg_download_manager_synchronous
results.append (download svnchronous(url))
File "/Users/rshrode/miniconda3/lib/python3.8/site-packages/KEGG parser/downloader.py", line 59, in download synchronous
raise ValueError("KEGG has forbidden reguest after %s attempts for url %s
which returns a response status of %s
%
ValueError: KEGG has forbidden request after 10 attempts for urlhtto://rest.kega.ip/aet/K14648+K13903+K06473+K07522+K09456+K12076+K23204+K06910+K04999+K09203
hich returns a response status of 403

Thank you!

sterrettJD commented 8 months ago

Hey @raeshrode , it looks like KEGG_parser is requesting a weird url... In that last line,

htto://rest.kega.ip/aet/K14648+K13903+K06473+K07522+K09456+K12076+K23204+K06910+K04999+K09203 should be http://rest.kegg.jp/get/K14648+K13903+K06473+K07522+K09456+K12076+K23204+K06910+K04999+K09203.

(htto -> http; kega -> kegg; ip -> jp; aet -> get)
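One way to check what a well-formed request should look like is to rebuild the URL from the KO IDs and compare it with the logged string. The sketch below is only an illustration: `build_get_urls` is a hypothetical helper, the base URL is the one written in this thread, and the batch size of 10 is inferred from the URLs in the error output, each of which carries 10 IDs.

```python
from typing import List

KEGG_GET = "http://rest.kegg.jp/get/"  # base URL as written in this thread
BATCH_SIZE = 10  # inferred: each URL in the error output carries 10 IDs


def build_get_urls(ids: List[str], batch_size: int = BATCH_SIZE) -> List[str]:
    """Join IDs with '+' in batches and append each batch to the base URL."""
    return [KEGG_GET + "+".join(ids[i:i + batch_size])
            for i in range(0, len(ids), batch_size)]


ids = ["K14648", "K13903", "K06473", "K07522", "K09456",
       "K12076", "K23204", "K06910", "K04999", "K09203"]
print(build_get_urls(ids)[0])
```

If the URL printed here differs from the one in a log only in a few look-alike characters, the difference more likely arose when the log was copied than when the request was built.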

I haven't seen this before, and I'm not sure how this string is getting corrupted. Would you be able to email me the command/input data you're using for AMON (john.sterrett@colorado.edu)? I can see if I get the same error on my end.

I could be wrong, but I think that this may be a different issue from what Lucas was dealing with. In this case, AMON is attempting to download the KEGG data in parallel, then when that fails, it's attempting to download the data not in parallel. Lucas's error was due to hitting limits in the number of requests allowed per minute by KEGG, but this error seems to be related to some corruption of the URL string requested...

sterrettJD commented 8 months ago

I tested with @raeshrode's data and got the 403 error, but no weird URL. I think KEGG now "forbids" requests for longer once a requester is "banned"... That means the strategy of attempting a parallel download and then falling back to a non-parallel one no longer works, because users are still forbidden from making KEGG requests for something like 30(?) minutes by the time the non-parallel download starts.
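If the ban really does last a while, an immediate retry cannot succeed, and a retry would need to wait out a cooldown first. The sketch below shows that idea only. `fetch_with_cooldown` is hypothetical, the caller supplies a `fetch` function returning a status code and body, and the 1800 second default just mirrors the uncertain "30(?) minutes" guess above.

```python
import time
from typing import Callable, Tuple


def fetch_with_cooldown(url: str,
                        fetch: Callable[[str], Tuple[int, str]],
                        max_attempts: int = 3,
                        cooldown: float = 1800.0,
                        sleep: Callable[[float], None] = time.sleep) -> str:
    """Fetch a URL, sleeping for `cooldown` seconds after each 403 before retrying."""
    for attempt in range(1, max_attempts + 1):
        status, body = fetch(url)
        if status == 200:
            return body
        if status != 403:
            raise RuntimeError(f"unexpected status {status} for {url}")
        if attempt < max_attempts:
            # Assumed ban duration; the real value is not documented.
            sleep(cooldown)
    raise RuntimeError(f"{url} still forbidden after {max_attempts} attempts")
```

With these defaults a single stubborn URL could block for an hour, so in practice the cooldown and attempt count would need tuning against observed behavior.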

Anyway, I've updated KEGG_parser to have an option to not try the parallel downloading that seems to be causing the issue, and I've changed the default behavior of AMON to skip the parallel download attempt. Parallel downloading in AMON can be re-enabled using --download_kegg_async, but I'd recommend against that for now. This unfortunately makes things much slower :( (it'll probably take 60-90 minutes for the downloads)

Rachel, can you try updating AMON -> v1.0.1 and kegg_parser -> v0.0.8, and see if that fixes things? It does on my end (with your data). There's an error downstream when calculating enrichment, but that may be because you're only using one species for the microbial side. I'm hoping/assuming that'll go away once you add more taxa into the mix.