google / deepconsensus

DeepConsensus uses gap-aware sequence transformers to correct errors in Pacific Biosciences (PacBio) Circular Consensus Sequencing (CCS) data.
BSD 3-Clause "New" or "Revised" License

Alignment loss function #30

Closed: marcpaga closed this issue 2 years ago

marcpaga commented 2 years ago

Thanks for the very interesting work.

I was wondering about the alignment loss used to train the model. It is clear that indels can shift the whole predicted sequence, so a position-wise loss like cross-entropy blows up from small mistakes. I thought a CTC loss would work in this scenario, but you developed a new alignment loss for this task. Could you elaborate on why this alignment loss is needed, or why CTC is not viable here?
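To make the indel concern concrete, here is a minimal Python sketch. It is not DeepConsensus's alignment loss and not CTC. The sequences, the `ALPHABET` with a `-` gap symbol, the 0.96 confidence value, and all function names are assumptions chosen purely for illustration. It compares a position-by-position cross-entropy with a plain Levenshtein edit distance for two predictions that each contain exactly one error.

```python
import math

ALPHABET = "ACGT-"  # '-' is a gap/padding symbol (assumed for this sketch)

def confident_probs(seq, p=0.96):
    """Turn a hard prediction into per-position distributions:
    probability p on the called symbol, the rest spread evenly."""
    rest = (1.0 - p) / (len(ALPHABET) - 1)
    return [{s: (p if s == c else rest) for s in ALPHABET} for c in seq]

def positional_cross_entropy(pred, target):
    """Mean negative log-likelihood comparing position i with position i."""
    assert len(pred) == len(target)
    probs = confident_probs(pred)
    return -sum(math.log(dist[t]) for dist, t in zip(probs, target)) / len(target)

def edit_distance(a, b):
    """Levenshtein distance via dynamic programming (alignment-aware)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # match or substitution
        prev = cur
    return prev[-1]

target      = "ACGTTGCAAGCT"
substituted = "ACGTTGCAAGCA"  # one wrong base at the end
deleted     = "AGTTGCAAGCT-"  # one base dropped near the start, padded with a gap

ce_sub = positional_cross_entropy(substituted, target)
ce_del = positional_cross_entropy(deleted, target)
ed_sub = edit_distance(substituted, target)
ed_del = edit_distance(deleted.rstrip("-"), target)

print(f"substitution: cross-entropy={ce_sub:.3f}, edit distance={ed_sub}")
print(f"deletion:     cross-entropy={ce_del:.3f}, edit distance={ed_del}")
```

Both predictions are one edit away from the target, yet the deletion shifts every downstream base out of register, so the position-wise loss penalizes most of the sequence. An alignment-based score treats the two errors as equally small.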

kishwarshafin commented 2 years ago

Hi @marcpaga,

Thanks for opening this issue. There are several things to consider here:

Finally, we have not exhaustively looked at all the loss functions that are available. So, future experiments with other alignment-based loss functions including CTC may give us observations to help answer your question more accurately.

marcpaga commented 2 years ago

Hi @kishwarshafin,

Thanks for the quick and clear response! I think a performance comparison with CTC and other losses would be very interesting for scenarios where both could be used, not only in terms of speed or compute requirements but also accuracy. Furthermore, your alignment loss can be used in models where CTC cannot, such as seq2seq models, so thanks for your contribution!