arzwa / wgd

Python package and CLI for whole-genome duplication related analyses. **This package is deprecated in favor of** https://github.com/heche-psb/wgd.
http://wgd.readthedocs.io/en/latest/
GNU General Public License v3.0

wgd syn: multiplicons.txt file issue #65

Open chrisjackson-pellicle opened 3 years ago

chrisjackson-pellicle commented 3 years ago

Hi,

Thanks very much for this tool!

I've run into an issue running wgd syn. I've pasted the traceback below. This appears to be caused by the following code in the syntenic_dotplot function in viz.py:

genomic_elements_ = { x: 0 for x in list(set(df['list_x']) | set(df['list_y'])) if type(x) == str }

My i-ADHoRE (version 3.0) multiplicons.txt from the wgd run is here: multiplicons.txt.

As you can see, my list_x column has empty rows; these get read by pandas as nan, and this results in pandas converting the actual integers into floats. Due to the check for type string, the genomic_elements dictionary is empty, causing a bunch of downstream issues. However, the float-vs-integer issue also doesn't seem to be the root cause, as the dictionary would be empty even if the integers were of type integer in the pandas dataframe.
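To illustrate the behaviour described above, here is a minimal sketch. The three-column table is a made-up stand-in for multiplicons.txt (the real file has more columns), and the dictionary comprehension is copied from the line quoted from viz.py; everything else is an assumption for demonstration only.

```python
import io

import pandas as pd

# Hypothetical, heavily trimmed stand-in for multiplicons.txt (tab-separated).
# The second row has an empty list_x field, as in the reported file.
data = "id\tlist_x\tlist_y\n1\t92\t17\n2\t\t92\n"
df = pd.read_csv(io.StringIO(data), sep="\t")

# The empty field is read as NaN, which forces the whole column to float.
list_x_dtype = str(df["list_x"].dtype)
print(list_x_dtype)  # float64

# Same filter as the quoted line: only keys of type str are kept,
# so purely numeric list names never make it into the dictionary.
genomic_elements = {
    x: 0 for x in set(df["list_x"]) | set(df["list_y"]) if type(x) == str
}
print(genomic_elements)  # {}

# Any later lookup by list name then fails, as in the traceback.
lookup_failed = False
try:
    genomic_elements[df["list_x"].iloc[0]]
except KeyError:
    lookup_failed = True
print(lookup_failed)  # True
```

Note that list_y, which has no empty fields here, stays an integer column, yet its values are filtered out as well. That matches the observation that the float conversion alone is not the root cause.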

I've not used i-ADHoRE before - does my multiplicons.txt file look okay, and, if so, can you suggest a fix for the wgd issue?

My pandas version is 0.24.1.

Thanks very much,

Chris

Traceback (most recent call last):
  File "/home/cjackson/.local/bin/wgd", line 10, in <module>
    sys.exit(cli())
  File "/opt/miniconda3/miniconda3/lib/python3.6/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/opt/miniconda3/miniconda3/lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/opt/miniconda3/miniconda3/lib/python3.6/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/miniconda3/miniconda3/lib/python3.6/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/miniconda3/miniconda3/lib/python3.6/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/cjackson/.local/lib/python3.6/site-packages/wgd_cli.py", line 857, in syn
    gene_attribute, min_length, ks_range, iadhore_opts
  File "/home/cjackson/.local/lib/python3.6/site-packages/wgdcli.py", line 949, in syn
    '{}.dotplot.svg'.format(os.path.basename(families)))
  File "/home/cjackson/.local/lib/python3.6/site-packages/wgd/viz.py", line 244, in syntenic_dotplot
    [row['begin_x'], row['end_x']]]
  File "/home/cjackson/.local/lib/python3.6/site-packages/wgd/viz.py", line 243, in <listcomp>
    x = [genomic_elements[curr_list_x] + x for x in
KeyError: 92.0
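One possible workaround for the KeyError above is to read the list columns as strings, so that numeric list names such as 92 become the string '92' while empty fields remain NaN. This is a sketch under assumptions and not a fix confirmed by the maintainers: the sample table is made up, it has not been tested against wgd itself, and how wgd actually loads multiplicons.txt may differ.

```python
import io

import pandas as pd

# Hypothetical, trimmed stand-in for multiplicons.txt (tab-separated).
data = "id\tlist_x\tlist_y\n1\t92\t17\n2\t\t92\n"

# Assumption: forcing the list columns to str keeps numeric names as text.
df = pd.read_csv(
    io.StringIO(data),
    sep="\t",
    dtype={"list_x": str, "list_y": str},
)

# The filter quoted from viz.py now keeps the list names and still
# drops the NaN that stands for the empty list_x field.
genomic_elements = {
    x: 0 for x in set(df["list_x"]) | set(df["list_y"]) if type(x) == str
}
print(sorted(genomic_elements))  # ['17', '92']

# Rows with a missing list_x still have no dictionary entry, so a caller
# would presumably need to skip them before looking anything up.
rows_with_list_x = df[df["list_x"].notna()]
print(len(rows_with_list_x))  # 1
```

With string keys, a lookup such as genomic_elements['92'] succeeds. Whether rows with an empty list_x should be skipped or handled some other way depends on how wgd is meant to treat them, which the thread does not settle.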

YuJiang0121 commented 2 weeks ago

Hello, have you solved this problem? I'm running into the same issue and am looking for help!
