Right now, the skiplist is hardcoded: as in the original ColBERT, all punctuation in documents is masked for scoring.
It would be useful to let the user define it, so we need to decide how to expose the option and in what format to store it.
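
To make the discussion concrete, here is a rough sketch of one possible shape for the option. Everything in it is hypothetical and not existing API: the `skiplist_words` name, the helper functions, and the choice to store the list as plain strings in JSON (rather than token IDs) so that it does not depend on a particular tokenizer. The default falls back to ASCII punctuation to mirror the current hardcoded behaviour.

```python
import json
import string

# Hypothetical default: every ASCII punctuation character, mirroring the
# hardcoded punctuation skiplist described above.
DEFAULT_SKIPLIST_WORDS = list(string.punctuation)


def resolve_skiplist(skiplist_words=None):
    """Return the user's skiplist, or the punctuation default when None."""
    if skiplist_words is None:
        return list(DEFAULT_SKIPLIST_WORDS)
    return list(skiplist_words)


def dump_skiplist(skiplist_words):
    """Serialize the skiplist as a JSON list of strings."""
    return json.dumps({"skiplist_words": skiplist_words})


def load_skiplist(payload):
    """Read the skiplist back from its JSON form."""
    return json.loads(payload)["skiplist_words"]


def skiplist_mask(tokens, skiplist_words):
    """True for tokens kept for scoring, False for tokens that are masked."""
    skip = set(skiplist_words)
    return [tok not in skip for tok in tokens]
```

With this shape, passing `None` keeps today's behaviour, passing `[]` disables masking entirely, and passing a custom list replaces the default. Mapping the stored strings to token IDs would then happen at load time with whatever tokenizer the model uses.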