conchoecia / odp

oxford dot plots
GNU General Public License v3.0

Error in rule analysis_D_and_FET #72

Closed minglibio closed 6 months ago

minglibio commented 6 months ago

Hi @conchoecia

When I run odp, I get the following error. Do you have any idea what might be causing it?

Calculating FET of the LG xxx and the species xxx.
Calculating D of the LG xxx and the species xxx.
thisdir_dfs
 []
RuleException:
ValueError in file /odp/scripts/odp, line 701:
No objects to concatenate
  File "/odp/scripts/odp", line 1948, in __rule_analysis_D_and_FET
  File "odp/scripts/odp", line 701, in calc_D_for_y_and_x
  File "/odp/lib/python3.12/site-packages/pandas/core/reshape/concat.py", line 382, in concat
  File "/odp/lib/python3.12/site-packages/pandas/core/reshape/concat.py", line 445, in __init__
  File "/odp/lib/python3.12/site-packages/pandas/core/reshape/concat.py", line 507, in _clean_keys_and_objs
[Wed May  8 10:58:29 2024]
Error in rule analysis_D_and_FET:
    jobid: 0
    input: odp/step1-rbh/xxx_xxx_reciprocal_best_hits.rbh, odp/step0-chromsize/analyses/xxx_xxx.chromsize
    output: odp/step1-rbh/xxx_xxx_reciprocal_best_hits.D.FET.rbh

Exiting because a job execution failed. Look above for error message
WorkflowError:
At least one job did not complete successfully.

Best, Ming

conchoecia commented 6 months ago

That sounds like there were no significant hits between the two samples. I am assuming that you changed the text to be xxx, correct? Are you able to post the input files... odp/step1-rbh/xxx_xxx_reciprocal_best_hits.rbh, and the odp/step0-chromsize/analyses/xxx_xxx.chromsize file?
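For reference, the ValueError in the traceback is the one pandas raises when `concat` is handed an empty list, which matches the empty `thisdir_dfs` printed just before the failure. The snippet below is only a minimal sketch of that behaviour and of one possible guard. The helper name `safe_concat` and the two column names passed to it are illustrative assumptions, not code taken from odp.

```python
import pandas as pd


def safe_concat(frames, columns):
    """Hypothetical helper (not part of odp): concatenate a list of
    DataFrames, returning an empty frame with the given columns
    instead of raising when the list is empty."""
    if len(frames) == 0:
        return pd.DataFrame(columns=columns)
    return pd.concat(frames, ignore_index=True)


# Reproduce the failure from the traceback: concat on an empty list.
try:
    pd.concat([])
    concat_error = None
except ValueError as err:
    concat_error = str(err)
print(concat_error)

# With the guard, an empty input yields an empty DataFrame instead.
empty = safe_concat([], columns=["rbh", "whole_FET"])
print(empty.empty)
```

A guard like this only hides the symptom. An empty list still means that no hits survived for the pair, so the inputs are worth checking first.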

minglibio commented 6 months ago

Thanks for your prompt reply! @conchoecia

Yes, I changed the text to show the error more simply.

Below please find the input files.

DanioRerio_LepisosteusOculatus.chromsize.txt DanioRerio_LepisosteusOculatus_reciprocal_best_hits.rbh.txt

victor4110 commented 6 months ago

Hi, were you able to solve the error? I am getting the same error when running

snakemake -j 72 -p --snakefile odp/scripts/odp --configfile ./config.yaml

I checked the log file but I couldn't find more details.

HX-FAshur commented 6 months ago

Hi @victor4110 and @minglibio! If it helps, I was having the same issue until I changed the chromosome headers in the fasta files to match the chromosome names used in the .chrom files. For example, if your .chrom has a line like: cytochrome-c-oxidase-subunit-1 C_0 + 1 1587 then the corresponding header in the genome's .fasta has to be ">C_0". This worked, and the script stopped throwing the "No objects to concatenate" error, but then I started getting:

**Empty DataFrame**
Columns: [rbh, CutaneotrichosporonCavernicolaHIS641_gene, CutaneotrichosporonspHIS471_gene, gene_group, CutaneotrichosporonCavernicolaHIS641_scaf, CutaneotrichosporonCavernicolaHIS641_pos, CutaneotrichosporonspHIS471_scaf, CutaneotrichosporonspHIS471_pos, CutaneotrichosporonCavernicolaHIS641_breakchrom, CutaneotrichosporonspHIS471_breakchrom, CutaneotrichosporonCavernicolaHIS641_ix, CutaneotrichosporonCavernicolaHIS641_break_ix, CutaneotrichosporonspHIS471_ix, CutaneotrichosporonspHIS471_break_ix, whole_FET, break_FET, CutaneotrichosporonCavernicolaHIS641_D, CutaneotrichosporonspHIS471_D, color]
**Index: []**
[Sun May 19 19:04:00 2024]
Finished job 11.
15 of 17 steps (88%) done
All the CutaneotrichosporonspHIS471 scaffolds have been identified in a connected component, continuing
All the CutaneotrichosporonCavernicolaHIS641 scaffolds have been identified in a connected component, continuing
**Bus error**
[Sun May 19 19:04:00 2024]
Error in rule plot_synteny_nocolor:
    jobid: 16
    input: odp/step1-rbh/CutaneotrichosporonCavernicolaHIS641_CutaneotrichosporonspHIS471_reciprocal_best_hits.D.FET.rbh, odp/step0-chromsize/analyses/CutaneotrichosporonCavernicolaHIS641_CutaneotrichosporonspHIS471.chromsize
    output: odp/step2-figures/synteny_nocolor/CutaneotrichosporonCavernicolaHIS641_CutaneotrichosporonspHIS471_xy_reciprocal_best_hits.plotted.rbh, odp/step2-figures/synteny_nocolor/CutaneotrichosporonCavernicolaHIS641_CutaneotrichosporonspHIS471_xy_synteny.pdf, odp/step2-figures/synteny_nocolor/CutaneotrichosporonCavernicolaHIS641_CutaneotrichosporonspHIS471_yx_synteny.pdf

Shutting down, this might take some time.

So if any of you or @conchoecia have any clues on how to solve this, I'd appreciate it!
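In case it is useful for checking the header mismatch described above, here is a rough sketch that lists scaffold names found in a .chrom file but missing from the genome fasta headers. It assumes the five-column .chrom layout shown in this thread (protein, scaffold, strand, start, stop) and takes the first word after ">" as the fasta ID. The function names and the ">scaffold_0" header are made up for illustration and are not part of odp.

```python
def fasta_headers(lines):
    """Return the set of sequence IDs (first word after '>') in FASTA lines."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids


def chrom_scaffolds(lines):
    """Return the set of scaffold names (second column) in .chrom lines."""
    names = set()
    for line in lines:
        fields = line.split()  # splits on tabs or spaces
        if len(fields) >= 5:
            names.add(fields[1])
    return names


def missing_scaffolds(chrom_lines, fasta_lines):
    """Scaffolds named in the .chrom lines that have no FASTA header."""
    return sorted(chrom_scaffolds(chrom_lines) - fasta_headers(fasta_lines))


# In-memory stand-ins for the real files; the fasta header is hypothetical.
chrom = ["cytochrome-c-oxidase-subunit-1 C_0 + 1 1587"]
mismatched_fasta = [">scaffold_0 some description", "ACGT"]
matching_fasta = [">C_0", "ACGT"]

print(missing_scaffolds(chrom, mismatched_fasta))
print(missing_scaffolds(chrom, matching_fasta))
```

With real data, the same functions accept open file handles in place of the lists, for example `missing_scaffolds(open("my.chrom"), open("my.fasta"))`, where both file names are placeholders.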

victor4110 commented 6 months ago

Hi @HX-FAshur, yes that solved the error! Thank you!

minglibio commented 6 months ago

Hi @HX-FAshur, thanks for the reply! I checked my chrom file and the protein and genome sequence files, but I cannot see any inconsistencies among them. Maybe I misunderstood what you meant? Below is an example of my input files.

chrom file:

Loc_XP_006625329.2  LG1 -   4502    13178
Loc_XP_006625330.1  LG1 -   15641   19299
Loc_XP_015212439.1  LG1 -   24649   36794
Loc_XP_006625332.1  LG1 +   37609   58077

protein file:

>Loc_XP_006625329.2
XXX
>Loc_XP_006625330.1
XXX
>Loc_XP_015212439.1
XXX
>Loc_XP_006625332.1
XXX

genome file:

>LG1
XXX
>LG2
XXX

@conchoecia, could you test this error when you have time? If so, I can provide the input files to reproduce it.

Best, Ming

minglibio commented 6 months ago

I found the bug. It is caused by purely numeric chromosome names. Adding a non-numeric prefix to the chromosome names solved the problem, e.g. changing 1 to LG1.
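For anyone who needs to apply the same workaround, the renaming can be sketched roughly as below. It assumes a tab-separated .chrom file with the scaffold name in the second column and takes the first word after ">" as the fasta ID. The function names, the "LG" default prefix, and the example line with a scaffold named 1 are illustrative, not part of odp.

```python
def add_prefix_if_numeric(name, prefix="LG"):
    """Prefix names made only of digits; leave other names unchanged."""
    return prefix + name if name.isdigit() else name


def fix_chrom_line(line, prefix="LG"):
    """Rename the scaffold (second tab-separated column) of a .chrom line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) >= 2:
        fields[1] = add_prefix_if_numeric(fields[1], prefix)
    return "\t".join(fields)


def fix_fasta_line(line, prefix="LG"):
    """Rename the sequence ID of a FASTA header; pass sequence lines through."""
    line = line.rstrip("\n")
    if line.startswith(">"):
        parts = line[1:].split(None, 1)  # ID, then optional description
        if parts:
            parts[0] = add_prefix_if_numeric(parts[0], prefix)
            return ">" + " ".join(parts)
    return line


# Hypothetical input lines using a purely numeric scaffold name.
print(fix_chrom_line("Loc_XP_006625329.2\t1\t-\t4502\t13178"))
print(fix_fasta_line(">1"))
print(fix_fasta_line(">LG2"))
```

The .chrom file and the genome fasta need the same prefix so the names still match. Any odp output generated from the old names probably needs to be regenerated.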