jibsch / Socrates

Socrates: Identification of genomic rearrangements in tumour genomes by re-aligning soft clipped reads

problem with gene annotation #11

Closed ngdxbx closed 10 years ago

ngdxbx commented 10 years ago

I have a BED file as follows:

chr1 69090 70008 OR4F5
chr1 367658 368597 OR4F29
chr1 621095 622034 OR4F16
chr1 861321 879533 SAMD11
chr1 880073 894620 NOC2L

I used the following commands:

sortBed -i myfile.bed > myfile.sort.bed
bgzip myfile.sort.bed
tabix myfile.sort.bed.gz

Socrates annotate --features myfile.sort.bed.gz results_Socrates_paired_myresult.txt

I got an error message about "ArrayIndexOutOfBoundsException".

It would be very helpful if you could provide tabix-indexed BED files for gene annotation and RepeatMasker that work with your script.

jibsch commented 10 years ago

Hi, we'd rather not put those files up for download, so that people don't have to work with potentially outdated data. The steps you took to index the annotation track are correct. I took your example from above with the exact file names and commands, and it runs through without complaints, so it is difficult at this stage to understand what the problem is. Can you elaborate on the details of the exception? One thing I could think of: does your file name indeed include ".bed"? If not, Socrates will assume it to be a different format and look in column 12 for the annotation (as described in the help text), which could lead to such an exception. Please let me know the details of the exception, or whether changing the file name helps. Thanks

ngdxbx commented 10 years ago

Great. Thanks a lot. And you are right: the file was named ".txt" but is in BED format (4 columns). I thought Socrates could detect the format automatically. I changed the extension to ".bed" and it ran through.
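
For readers who hit the same exception, the behaviour described in this thread can be sketched roughly as below. This is only an illustration in Python, not Socrates' actual code: the function names `annotation_column` and `annotation_for` are hypothetical, and the rule (a file name containing ".bed" means the annotation is in column 4, otherwise column 12) is taken from the maintainer's comment above. It also assumes tab-separated fields.

```python
def annotation_column(file_name: str) -> int:
    """Return the 0-based index of the column assumed to hold the annotation."""
    # Assumption from the thread: ".bed" in the name means BED layout
    # (annotation in column 4); any other name means column 12.
    return 3 if ".bed" in file_name else 11


def annotation_for(line: str, file_name: str) -> str:
    """Pick the annotation field from one tab-separated record."""
    fields = line.rstrip("\n").split("\t")
    # With a 4-column record and a non-".bed" name, index 11 does not exist.
    # Python raises IndexError here, comparable to the Java
    # ArrayIndexOutOfBoundsException reported in this issue.
    return fields[annotation_column(file_name)]


if __name__ == "__main__":
    record = "chr1\t69090\t70008\tOR4F5"
    print(annotation_for(record, "myfile.sort.bed.gz"))  # prints OR4F5
    try:
        annotation_for(record, "myfile.sort.txt.gz")
    except IndexError:
        print("4-column record read as the 12-column format")
```

In this sketch the same 4-column record works or fails depending only on the file name, which matches the fix reported above: renaming the file so that it includes ".bed".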
