tleonardi / nanocompore

RNA modifications detection from Nanopore dRNA-Seq data
https://nanocompore.rna.rocks
GNU General Public License v3.0

IndexError during Eventalign_collapse #197

Closed Tomcxf closed 2 years ago

Tomcxf commented 2 years ago

Dear developer,

Thanks for sharing this tool for modification detection! When I run eventalign_collapse after nanopolish eventalign, I get an error. The command was:

    nanocompore eventalign_collapse -t 100 -i ../nanopolish/{eventalign_reads_tsv} -o Ti1_eventalign_collapsed_reads.tsv

After nearly 24 hours the run seemed to finish, but it reported this error:

    2022-03-15T09:33:52.604802+0000 ERROR - Process-1 | Traceback (most recent call last):
      File "/home/ubuntu/anaconda3/envs/nanocompore/lib/python3.7/site-packages/nanocompore/Eventali
        for l in sp:
      File "/home/ubuntu/anaconda3/envs/nanocompore/lib/python3.7/site-packages/nanocompore/SuperPar
        line = self._parse_line(line)
      File "/home/ubuntu/anaconda3/envs/nanocompore/lib/python3.7/site-packages/nanocompore/SuperPar
        line = [line[i] for i in self.select_idx]
      File "/home/ubuntu/anaconda3/envs/nanocompore/lib/python3.7/site-packages/nanocompore/SuperPar
        line = [line[i] for i in self.select_idx]
    IndexError: list index out of range

What could be causing this? At first I suspected the nanopolish index path (it was not an absolute path), but after changing it and rerunning I got the same result. Since the output file was still written (~60G), I don't know whether this is just a warning or not. Thank you!

JeremyQuo commented 2 years ago

I modified the code and added a try block to skip the wrong index and obtain the final results. I hope this helps.

def _parse_line (self, line):

    # Split line
    line = line.rstrip().split(self._sep)
    # Select field if needed
    try:
        if self.select_idx:
            line = [line[i] for i in self.select_idx]
    except Exception:
        raise SuperParserError("Cannot find the idx")

Tomcxf commented 2 years ago

Thanks for your help! I will try it later. By the way, when I process another set of files, it works well. The difference between the two projects is that the project with the error was indexed on raw fast5 data, while the "no error" project was indexed on good fast5 data. I don't know whether that is the key. As you said, your change skips the wrong index, but I don't know whether this error can affect the result, because the output and the final result both seem normal. Maybe I should run it again and compare the two results. Thanks!

Tomcxf commented 2 years ago

It works! Thank you!

JeremyQuo commented 2 years ago

Actually, the method I uploaded above is not the best way, just the fastest. Alternatively, you can check the skipped data and fix it line by line.

I think the effect of this modification depends on how much data is skipped. If only a few lines are skipped, it should not influence your result. You can print each bad line or keep a counter to check.
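
The counting idea can be sketched outside nanocompore as follows. This is only an illustration under assumptions: `select_fields`, its parameters, and the demo data are hypothetical and not part of nanocompore's SuperParser. It mimics the `[line[i] for i in select_idx]` step, but records lines that are too short instead of raising.

```python
import io


def select_fields(lines, select_idx, sep="\t", max_report=5):
    """Return (rows, stats) for an iterable of delimited text lines.

    Hypothetical helper, not nanocompore code: lines with too few
    fields for `select_idx` are counted and sampled instead of raising.
    """
    stats = {"kept": 0, "skipped": 0, "examples": []}
    rows = []
    for line_no, raw in enumerate(lines, start=1):
        fields = raw.rstrip("\n").split(sep)
        try:
            rows.append([fields[i] for i in select_idx])
            stats["kept"] += 1
        except IndexError:
            # The line has fewer fields than the largest requested index.
            stats["skipped"] += 1
            if len(stats["examples"]) < max_report:
                stats["examples"].append((line_no, raw.rstrip("\n")))
    return rows, stats


# Made-up demo data: the third line is missing its last field.
demo = io.StringIO(
    "contig\tposition\tread\n"
    "chr1\t10\tr1\n"
    "chr1\t11\n"
    "chr1\t12\tr3\n"
)
rows, stats = select_fields(demo, select_idx=[0, 2])
print("kept:", stats["kept"], "skipped:", stats["skipped"])
for line_no, text in stats["examples"]:
    print("bad line", line_no, repr(text))
```

If the skipped count is tiny compared with the kept count, the skipped lines are probably a few truncated records. A large count would suggest the input file itself needs checking.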

Tomcxf commented 2 years ago

Thanks! I will try!