bioinform / rnacocktail

Exception: Unable to detect format from ['SNV;ENSG00000225630', '1', '+', '50', '0'] #12

Closed: WYSNI closed this issue 5 years ago

WYSNI commented 6 years ago

Hi,

First, thanks for this very complete pipeline.

I have a problem with step 5 of the editing mode.

error

I think the problem is on line 88 of the editing.py file, more precisely in the merge_info_SNV function.

image

The exception is raised on line 90 of this file because (again, I think) the cat() call cannot combine SNV_fwd and SNV_fwd1.

This is a capture of the SNV_no variable:

image

a capture of the "feature" variable in the merge_info_SNV function:

image

and of my VCF file:

image

Do you have any idea of what's going on?

Thanks!

msahraeian commented 6 years ago

@WYSNI Happy to see your interest in RNACocktail. Would you please let me know what version of bedtools and pybedtools you are using?

WYSNI commented 6 years ago

I'm using pybedtools 0.7.7 and bedtools 2.26.0 in a conda environment. Do you think it is a bedtools version problem?

msahraeian commented 6 years ago

@WYSNI I recall bedtools 2.26.0 had some issues with groupby. Would you please use a later version of bedtools, such as 2.27?
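A minimal sketch for confirming which bedtools version a Python process actually picks up, which can differ from the one you expect if several are installed. It assumes `bedtools --version` prints a string like `bedtools v2.27.1`. The function names are hypothetical and are not part of RNACocktail or pybedtools, and the 2.27 threshold is simply the suggestion made above.

```python
import re
import subprocess


def parse_bedtools_version(text):
    """Extract (major, minor, patch) from text such as 'bedtools v2.27.1'."""
    match = re.search(r"v?(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError("no version number found in %r" % text)
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def installed_bedtools_version():
    """Ask whichever bedtools binary comes first on PATH for its version."""
    out = subprocess.check_output(["bedtools", "--version"])
    return parse_bedtools_version(out.decode())


def is_recent_enough(version, minimum=(2, 27, 0)):
    """Compare a version tuple against the minimum suggested in this thread."""
    return version >= minimum
```

Calling `is_recent_enough(installed_bedtools_version())` inside the same conda environment that runs the pipeline should show whether the older binary is still the one being used.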

WYSNI commented 6 years ago

It works! Thanks a lot.

image

I don't understand why GIREMI returns an exit code of 1 in steps 6 and 9, even though I have giremi_out.txt.res and it looks fine.

giremi_out.txt.res

image

giremi.log image

Any idea?

msahraeian commented 6 years ago

@WYSNI The first return code of 1 is normal, as we run it in two iterations. For the second one, you can check the giremi.log file at work/giremi/A for more specific logs. In my experience, as long as giremi_out.txt.res exists and has the ifRNA column, the predictions should be OK. You can also check whether giremi_out.txt.res covers all your regions. Also, check how many 1's you have in the ifRNA (last) column.
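The last check above can be sketched roughly as follows. This assumes the result file is whitespace-delimited with a single header row, which is an assumption about the format and is not confirmed in this thread. The function name is hypothetical. It reads the last column, whatever its header is, and counts how many rows have the value 1.

```python
def count_flagged_sites(lines):
    """Return (last column name, number of data rows, rows whose last field is '1').

    Assumes whitespace-delimited text with one header row (an assumption
    about the result file layout, not something stated in the thread).
    """
    rows = [line.split() for line in lines if line.strip()]
    if not rows:
        raise ValueError("no header row found")
    header, records = rows[0], rows[1:]
    ones = sum(1 for rec in records if rec[-1] == "1")
    return header[-1], len(records), ones
```

Passing an open file handle for giremi_out.txt.res, for example `count_flagged_sites(open(path))` with `path` pointing at your own output, returns the three values. A count of zero 1's would be a reason to look at giremi.log more closely.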

WYSNI commented 6 years ago

OK, thanks! I have some "1" values in my ifRNAE column, so I think the result is OK. Last question: can I use the files in the test directory (GRCh37_genes_pos.bed.gz, GRCh37_strand_pos.bed.gz, GRCh38_genes_pos.bed.gz, GRCh38_strand_pos.bed.gz) for my experiments, or are they incomplete?

msahraeian commented 6 years ago

@WYSNI Yes, they are complete for the whole-genome GRCh37 and GRCh38 references.

WYSNI commented 6 years ago

Thanks a lot!

I'm closing this issue.

WYSNI commented 5 years ago

Hi,

Sorry, I'm reopening this issue because I have another problem with step 5 of the editing mode...

image

I think the problem is again on line 88 of the editing.py file, but this time with groupby.

I'm in a conda environment with bedtools 2.27.1 and pybedtools 0.7.7. My environment (RNACocktail) path comes first in my $PATH:

image

I don't understand what's going on! Any suggestion?

Thanks, WYSNI

WYSNI commented 5 years ago

OK, my bad: it was just a chromosome naming problem with the GRCh38 files in the test/ directory.
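For anyone hitting the same thing, a mismatch like this can be spotted before running the pipeline by comparing the first-column names of the two inputs. The sketch below assumes tab-delimited BED or VCF text with `#` header lines. The thread does not say what the exact mismatch was, so the `chr` prefix case is shown only as one common possibility. All function names are hypothetical.

```python
def chromosome_names(lines):
    """Collect first-column values from tab-delimited text, skipping '#' lines."""
    names = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        names.add(line.split("\t", 1)[0].strip())
    return names


def naming_mismatch(names_a, names_b):
    """Return the names unique to each set; two empty lists mean they agree."""
    return sorted(names_a - names_b), sorted(names_b - names_a)


def strip_chr_prefix(name):
    """Map 'chr1' to '1'; names without the prefix are returned unchanged."""
    return name[3:] if name.startswith("chr") else name
```

For the gzipped BED files, `gzip.open(path, "rt")` from the standard library gives a text handle that can be passed to `chromosome_names` directly.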

Sorry!

I've (re)closed this issue.