stjude / ROSE

ROSE: RANK ORDERING OF SUPER-ENHANCERS

IndexError: list index out of range when applying the example data #10

Closed JediWizard closed 3 years ago

JediWizard commented 3 years ago

Dear ROSE developers,

Thank you very much for developing ROSE! We would like to use your software to discover super-enhancers in our system, and we would appreciate any advice on what to change in our procedure to get a successful ROSE run.

We recently downloaded the current ROSE package from your GitHub repository: https://github.com/stjude/ROSE . We also downloaded the example data that accompanies ROSE, listed under "example data for ROSE": http://younglab.wi.mit.edu/super_enhancer_code.html

This is the command line we used:

ROSE_main.py -g hg18 -i HG18_MM1S_MED1.gff -r MM1S_MED1.hg18.bwt.sorted.bam -c MM1S_WCE.hg18.bwt.sorted.bam -o results

It returned:

USING HG18_MM1S_MED1.gff AS THE INPUT GFF
USING hg18 AS THE GENOME
MAKING START DICT
Traceback (most recent call last):
  File "/home/hjc/BioSoft/ROSE-master/bin/ROSE_main.py", line 496, in main()
  File "/home/hjc/BioSoft/ROSE-master/bin/ROSE_main.py", line 332, in main
    startDict = ROSE_utils.makeStartDict(annotFile)
  File "/home/hjc/BioSoft/ROSE-master/lib/ROSE_utils.py", line 136, in makeStartDict
    refseqTable,refseqDict = importRefseq(annotFile)
  File "/home/hjc/BioSoft/ROSE-master/lib/ROSE_utils.py", line 198, in importRefseq
    if line[1] in refseqDict:
IndexError: list index out of range

We tried to work out whether the format of the GFF file was causing the trouble, but that did not resolve it. The annotation file hg18_refseq.ucsc was downloaded from Bitbucket (https://bitbucket.org/young_computation/rose/src/master/annotation/).

We hope someone can help us understand why we get this error and how to fix it. Any help is appreciated. Thanks a lot in advance!

Huang JC
2021/4/29

JediWizard commented 3 years ago

Python 3.7.10, R 3.6.0, Samtools 1.12, and bedtools 2.30.0 have been installed.

madetunj commented 3 years ago

Hi, The IndexError most likely occurs because the annotation file you downloaded is incorrect or corrupted. Can you open your "annotation/hg18_refseq.ucsc" and check that the file has the column headers "#bin name chrom strand txStart txEnd cdsStart cdsEnd ..."? If not, you will need to download the file again using the raw file link (https://bitbucket.org/young_computation/rose/raw/feb35cb1d9556a76f8ac1f51521539bb30651343/annotation/hg18_refseq.ucsc). Hope this resolves the issue.
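The header check described above can also be scripted. The following is a minimal sketch, not part of ROSE: the function names `check_header_line` and `check_annotation_file` are hypothetical, and the only facts taken from this thread are the expected leading column names and the possibility that an HTML page was saved in place of the table.

```python
# Hypothetical helper, not part of ROSE: sanity-check the first line of an
# annotation file such as annotation/hg18_refseq.ucsc.

# Leading column names quoted in this thread.
EXPECTED_COLUMNS = ["#bin", "name", "chrom", "strand",
                    "txStart", "txEnd", "cdsStart", "cdsEnd"]


def check_header_line(first_line):
    """Return (ok, reason) for the first line of an annotation file."""
    stripped = first_line.strip()
    if not stripped:
        return False, "first line is empty"
    # A web page saved instead of the raw file usually starts like this.
    if stripped.lower().startswith(("<!doctype", "<html")):
        return False, "file looks like an HTML page, not a table"
    # Split on any whitespace so both tabs and spaces are accepted.
    fields = stripped.split()
    if fields[:len(EXPECTED_COLUMNS)] != EXPECTED_COLUMNS:
        return False, "header columns do not match the expected layout"
    return True, "header looks fine"


def check_annotation_file(path):
    """Read only the first line of the file at `path` and check it."""
    with open(path, "r", errors="replace") as handle:
        return check_header_line(handle.readline())
```

Calling `check_annotation_file("annotation/hg18_refseq.ucsc")` from the ROSE directory would then report whether the header matches; the path is only an example and depends on where the annotation folder lives.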

JediWizard commented 3 years ago

Thank you so much for your reply. This is my first time analyzing ChIP-seq data and applying ROSE for further analysis. Your suggestion was spot on: I had carelessly downloaded the "hg18_refseq.ucsc" file in HTML format without realising it. Your expertise is admirable. Thank you again for your contribution. Huang JC 2021/4/30

WangYuzhou1996 commented 3 months ago

Hi, I encountered the same problem. After running ROSE_main, I didn't get the SuperEnhancers.table.txt file and got the error "IndexError: list index out of range". Did you solve the problem?

JediWizard commented 3 months ago

Hi, I finally figured out what the problem was. In my case, I had modified the original code so that CRCmapper could be applied to hg38. The genome and annotation file therefore also had to be replaced with hg38 and hg38_refseq.ucsc. However, I had downloaded the wrong "hg38_refseq.ucsc" annotation file, which caused the error. The correct hg38_refseq.ucsc file should be downloaded via the Table Browser in the UCSC Genome Browser. The parameters I used in the Table Browser were: clade: Mammal; genome: Human; assembly: hg38; group: mRNA and EST; track: human mRNAs; table: UCSC RefSeq (refGene). With that file I finally got output. I hope my experience helps. Best wishes.
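Before rerunning ROSE with a newly downloaded annotation file, it may help to scan the file for rows that would trigger the same error. The traceback earlier in this thread fails on `line[1]`, so any row with fewer than two fields is a likely culprit. The sketch below is not ROSE code: `find_short_rows` is a hypothetical helper, and it assumes that rows are tab-separated and that lines starting with "#" are headers.

```python
# Hypothetical helper, not part of ROSE: list rows that have too few fields
# to survive an access such as line[1].

def find_short_rows(lines, min_fields=2, sep="\t"):
    """Return (line_number, field_count) for rows with fewer than min_fields.

    `lines` can be any iterable of strings, for example an open file.
    Lines starting with '#' are treated as headers and skipped (assumption).
    """
    short = []
    for number, raw in enumerate(lines, start=1):
        text = raw.rstrip("\n")
        if text.startswith("#"):
            continue
        count = len(text.split(sep))
        if count < min_fields:
            short.append((number, count))
    return short
```

Passing an open file handle for "hg38_refseq.ucsc" to `find_short_rows` returns an empty list when every data row has at least two fields. An HTML page or a truncated download would typically produce many entries.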
