AmpliconSuite / CoRAL

Reconstruction of focal amplifications with long reads
BSD 3-Clause "New" or "Revised" License
12 stars 3 forks source link

add_discordant_edge error when ruuning reconstruct #14

Open is01 opened 1 week ago

is01 commented 1 week ago

Hello, I am running reconstruct using Nanopore long read data and getting the following error: It works fine for most data, but I got the error with certain data.

Traceback (most recent call last):
  File "~/CoRAL/src/CoRAL.py", line 145, in <module>
    reconstruct_mode(args)
  File "~/CoRAL/src/CoRAL.py", line 23, in reconstruct_mode
    infer_breakpoint_graph.reconstruct(args)
  File "~/CoRAL/src/infer_breakpoint_graph.py", line 1656, in reconstruct
    b2bn.build_graph()
  File "~/CoRAL/src/infer_breakpoint_graph.py", line 1037, in build_graph
    self.lr_graph[amplicon_idx].add_discordant_edge(bp[0], bp[1], bp[2], bp[3], bp[4], bp[5], lr_count = len(bp[-1]), reads = bp[-1])
  File "~/CoRAL/src/breakpoint_graph.py", line 199, in add_discordant_edge
    raise Exception("Breakpoint node must be added first.")
Exception: Breakpoint node must be added first.

Below is the script:

## Python v3.10.14
## CoRAL v1.0.0

./CoRAL/src/CoRAL.py reconstruct \
--lr_bam ONT.bam \
--cnv_seed CN6_SEEDS.bed \
--output_prefix sample \
--cn_seg cncf.bed \
--log_fn sample \

I noticed that in the build_graph function, there was a case where 2nd bp in self.new_bp_list was included in split_int and the 1st bp was not included, and this seemed to be the cause of the error.

It seems that the error can be avoided by changing the script as follows. However, I am worried that this may affect the analysis results. Is anyone else dealing with this kind of issue?

Here is edited code: src/infer_breakpoint_graph.py.

    def build_graph(self):
        """
        Organize the identified discordant edges into a list of breakpoint graphs, stored in lr_graph
        Each graph represent a connected component of amplified intervals, i.e., amplicon
        """
        # Split amplified intervals according to discordant edges
        split_int = dict()
        for bpi in range(len(self.new_bp_list)):
            bp = self.new_bp_list[bpi]
            for ai in range(len(self.amplicon_intervals)):  
                seg = self.amplicon_intervals[ai]
                if bp[0] == seg[0] and seg[1] < bp[1] < seg[2]:
                    if bp[2] == '+':
                        try:
                            split_int[ai].append((bp[1], bp[1] + 1, bpi, 1, '+'))
                        except:
                            split_int[ai] = [(bp[1], bp[1] + 1, bpi, 1, '+')]
                    if bp[2] == '-':
                        try:
                            split_int[ai].append((bp[1] - 1, bp[1], bpi, 1, '-'))
                        except:
                            split_int[ai] = [(bp[1] - 1, bp[1], bpi, 1, '-')]
                if bp[3] == seg[0] and seg[1] < bp[4] < seg[2]:
                    if bp[5] == '+':
                        try:
                            split_int[ai].append((bp[4], bp[4] + 1, bpi, 4, '+'))
                        except:
                            split_int[ai] = [(bp[4], bp[4] + 1, bpi, 4, '+')]
                    if bp[5] == '-':
                        try:
                            split_int[ai].append((bp[4] - 1, bp[4], bpi, 4, '-'))
                        except:
                            split_int[ai] = [(bp[4] - 1, bp[4], bpi, 4, '-')]
                    if bp[2] == '+' and bp[0] == seg[0] and (bp[1] < seg[1] or seg[2] > bp[1]):
                        try:
                            split_int[ai].append((bp[1], bp[1] + 1, bpi, 1, '+'))
                        except:
                            split_int[ai] = [(bp[1], bp[1] + 1, bpi, 1, '+')]
                    if bp[2] == '-' and bp[0] == seg[0] and (bp[1] < seg[1] or seg[2] > bp[1]):
                        try:
                            split_int[ai].append((bp[1] - 1, bp[1], bpi, 1, '-'))
                        except:
                            split_int[ai] = [(bp[1] - 1, bp[1], bpi, 1, '-')]

Thanks

is01 commented 1 week ago

It appears that the error can be avoided by simply commenting out the Exception and the graph construction completed.

src/breakpoint_graph.py

    def add_discordant_edge(self, chr1, pos1, o1, chr2, pos2, o2, sr_count = -1, sr_flag = 'd', \
                sr_cn = 0.0, lr_count = -1, reads = set([]), cn = 0.0):
        """
        Add a discordant edge to the graph.
        """
        if (chr1, pos1, o1) not in self.nodes or (chr2, pos2, o2) not in self.nodes:
            logging.debug("#TIME " + '%.4f\t' %(time.time() - global_names.TSTART) + "### Exception Breakpoint node must be added first. Continue::" + str(chr1) + ":" + str(pos1) + " " + str(o1) + "  " + str(chr2) + ":" + str(pos2) + " " + str(o2) )
            #raise Exception("Breakpoint node must be added first.")
        else:
            ld = len(self.discordant_edges)
            self.nodes[(chr1, pos1, o1)][2].append(ld)
            self.nodes[(chr2, pos2, o2)][2].append(ld)
            if (chr1, pos1, o1) in self.endnodes:
                self.endnodes[(chr1, pos1, o1)].append(ld)
            if (chr2, pos2, o2) in self.endnodes:
                self.endnodes[(chr2, pos2, o2)].append(ld)
            self.discordant_edges.append([chr1, pos1, o1, chr2, pos2, o2, sr_count, sr_flag, sr_cn, lr_count, reads, cn])