zyxue / ncbitax2lin

🐞 Convert NCBI taxonomy dump into lineages
MIT License

~\multiprocessing\pool.py", line 771, in get raise self._value: KeyError: 1 #14

Closed naurasd closed 3 years ago

naurasd commented 3 years ago

Hi,

I was trying to use your tool but got a KeyError: 1 from Python's multiprocessing. Any idea what the issue could be? Here is the error output:

Traceback (most recent call last):
File "c:\users\nauras\programs\python\python39\lib\multiprocessing\pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "c:\users\nauras\programs\python\python39\lib\multiprocessing\pool.py", line 48, in mapstar
return list(map(*args))
File "c:\users\nauras\programs\python\python39\lib\site-packages\ncbitax2lin\ncbitax2lin.py", line 78, in find_lineage
record = TAXONOMY_DICT[tax_id]
KeyError: 1

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "c:\users\nauras\programs\python\python39\lib\runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "c:\users\nauras\programs\python\python39\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "C:\Users\Nauras\Programs\Python\Python39\Scripts\ncbitax2lin.exe\__main__.py", line 7, in <module>
File "c:\users\nauras\programs\python\python39\lib\site-packages\ncbitax2lin\ncbitax2lin.py", line 192, in main
fire.Fire(taxonomy_to_lineages)
File "c:\users\nauras\programs\python\python39\lib\site-packages\fire\core.py", line 138, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "c:\users\nauras\programs\python\python39\lib\site-packages\fire\core.py", line 463, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "c:\users\nauras\programs\python\python39\lib\site-packages\fire\core.py", line 672, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "c:\users\nauras\programs\python\python39\lib\site-packages\ncbitax2lin\ncbitax2lin.py", line 179, in taxonomy_to_lineages
lineages = find_all_lineages(df_data.tax_id)
File "c:\users\nauras\programs\python\python39\lib\site-packages\ncbitax2lin\ncbitax2lin.py", line 101, in find_all_lineages
return pool.map(find_lineage, tax_ids)
File "c:\users\nauras\programs\python\python39\lib\multiprocessing\pool.py", line 364, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "c:\users\nauras\programs\python\python39\lib\multiprocessing\pool.py", line 771, in get
raise self._value
KeyError: 1

Cheers, Nauras

zyxue commented 3 years ago

How are you running ncbitax2lin?

Also, can you format your error better with triple back quotes?

naurasd commented 3 years ago

Thanks for getting back so quickly. I used the insert code function in the comment panel, but it didn't display the error output correctly. Sorry about that, I will use triple backticks next time.

I am running the usual command in Git Bash:

ncbitax2lin nodes.dmp names.dmp

I installed the tool with your standard command and even downloaded the taxdump.tar.gz file again to rule out corrupted NCBI files.

naurasd commented 3 years ago

Sorry, I should have mentioned that I am running it on a Windows machine.

zyxue commented 3 years ago

Sorry, I'm not familiar with Windows and can't debug it there. Could you try on a Linux or macOS machine?

By the way, I updated your comment to use triple backticks.

naurasd commented 3 years ago

No, I cannot try on Linux or macOS, sorry. So you are not familiar with this error message from your tool? Has it never occurred before?

zyxue commented 3 years ago

Initially, I thought it might have something to do with the multiprocessing module, which is likely implemented differently on Linux and Windows.

Looking at it again, it's complaining that tax_id = 1 is not available in TAXONOMY_DICT. Where are your nodes.dmp and names.dmp from? Did you get them from NCBI directly, or have you modified them somehow?
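One way to answer that question is to check the input file directly. The following is a minimal sketch, assuming the standard NCBI taxdump layout in which `nodes.dmp` fields are separated by `\t|\t` and the first two fields are the tax_id and its parent tax_id. The function name `read_nodes` is hypothetical and is not part of ncbitax2lin.

```python
def read_nodes(path):
    """Return {tax_id: parent_tax_id} parsed from an NCBI nodes.dmp file.

    Hypothetical helper for checking the input; not ncbitax2lin's own parser.
    """
    parents = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            # Drop the trailing "\t|" terminator and newline, then split fields.
            fields = line.rstrip("\t|\n").split("\t|\t")
            if len(fields) < 2:
                continue  # skip blank or malformed lines
            parents[int(fields[0])] = int(fields[1])
    return parents
```

For an unmodified dump, `1 in read_nodes("nodes.dmp")` should be true, because the root node has tax_id 1. If it is, the input file is unlikely to be the cause of the KeyError.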

naurasd commented 3 years ago

I downloaded them with the command you suggest. The download and the unzipping work fine; I just tried again. But running ncbitax2lin resulted in the same error.

zyxue commented 3 years ago

I don't know why it doesn't work for you. I just reran it and it finished successfully.

naurasd commented 3 years ago

It might be a Windows issue then. Would you by any chance be able to send me a .csv.gz file created from the most recent names.dmp and nodes.dmp files? I would really appreciate it, as your tool is a very nice solution, but there seems to be some kind of bug when running it on Windows. Thanks so much, Nauras

zyxue commented 3 years ago

https://gitlab.com/zyxue/ncbitax2lin-lineages/-/blob/master/ncbi_lineages_2020-12-01.csv.gz
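A file like this can be read with the Python standard library alone. The following is a minimal sketch, assuming the file has been downloaded locally and has a header row that includes a `tax_id` column; the function names and the column name are assumptions and were not checked against the linked file.

```python
import csv
import gzip


def iter_lineages(path):
    """Yield one dict per row of a gzip-compressed lineage CSV."""
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        yield from csv.DictReader(handle)


def find_by_tax_id(path, tax_id):
    """Return the first row whose tax_id column matches, or None."""
    for row in iter_lineages(path):
        # The csv module returns strings, so compare against str(tax_id).
        if row.get("tax_id") == str(tax_id):
            return row
    return None
```

Because the rows are streamed, the whole file never has to fit in memory.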

naurasd commented 3 years ago

Thanks so much. I will close this issue for now, as I don't think there is an obvious solution to the error. Cheers, Nauras