Yoctol / strpipe

text preprocessing pipeline

annotated normalizer #45

Open SoluMilken opened 5 years ago

SoluMilken commented 5 years ago

tx_map = {
    start_pos(int): {
        'end': end_pos(int),
        'orig': original_segment,
        'rep': replacement, 
    },
}

meta = {
    start_pos(int): {
        'end': end_pos(int),
        'org_start': original_start_pos(int),
        'org_end': original_end_pos(int),
    }
}

class Doc:
    def __init__(self, input_str: str):
        self.original_str = input_str
        self.
    def transform(self, tx_map):
        # Returns (transformed_str, inv_tx_map).
        raise NotImplementedError

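    def inverse_transform_label(self, pred, inv_tx_map):
        # Returns bd_pred: pred expanded so each label lines up with the original string.
        output = []
        i = 0
        while i < len(pred):
            if i not in inv_tx_map:
                # Position was not rewritten: keep its label as-is.
                output.append(pred[i])
                i += 1
            else:
                # Position starts a replaced segment: repeat its label over the original span.
                diff = inv_tx_map[i]['org_end'] - inv_tx_map[i]['org_start']
                output.extend([pred[i]] * diff)
                i = inv_tx_map[i]['end']
        return output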
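
A minimal sketch of how the two maps above might fit together, written as free functions rather than `Doc` methods. It is an illustration under assumptions, not the strpipe implementation: the sample sentence, the `_phone_` token, and the `O`/`PHONE` labels are made up, replaced spans are assumed not to overlap, replacements are assumed non-empty, and `pred` is assumed to hold one label per character of the transformed string.

```python
from typing import Dict, List, Tuple


def transform(text: str, tx_map: Dict[int, dict]) -> Tuple[str, Dict[int, dict]]:
    """Apply the replacements in tx_map and record how to map positions back."""
    pieces = []
    inv_tx_map = {}
    cursor = 0   # next unread position in the original string
    out_len = 0  # length of the transformed string built so far
    for start in sorted(tx_map):
        entry = tx_map[start]
        end, rep = entry['end'], entry['rep']
        if start < cursor or text[start:end] != entry['orig']:
            raise ValueError(f"bad or overlapping span at {start}")
        if not rep:
            # Assumption: an empty replacement would leave no position to key on.
            raise ValueError("empty replacements are not handled in this sketch")
        # Copy the untouched text before the span, then the replacement.
        pieces.append(text[cursor:start])
        out_len += start - cursor
        inv_tx_map[out_len] = {
            'end': out_len + len(rep),
            'org_start': start,
            'org_end': end,
        }
        pieces.append(rep)
        out_len += len(rep)
        cursor = end
    pieces.append(text[cursor:])
    return ''.join(pieces), inv_tx_map


def inverse_transform_label(pred: List[str], inv_tx_map: Dict[int, dict]) -> List[str]:
    """Expand per-character labels on the transformed string to the original length."""
    output = []
    i = 0
    while i < len(pred):
        if i not in inv_tx_map:
            output.append(pred[i])
            i += 1
        else:
            span = inv_tx_map[i]
            output.extend([pred[i]] * (span['org_end'] - span['org_start']))
            i = span['end']
    return output


text = "call 0912345678 now"
tx_map = {5: {'end': 15, 'orig': '0912345678', 'rep': '_phone_'}}
transformed, inv_tx_map = transform(text, tx_map)

# Hypothetical per-character predictions on the transformed string.
pred = ['O'] * 5 + ['PHONE'] * 7 + ['O'] * 4
restored = inverse_transform_label(pred, inv_tx_map)
print(transformed)
print(len(restored) == len(text))
```

Here the 10-digit number collapses to a 7-character token, and the inverse step stretches the token's label back over the 10 original characters, so the restored labels align with the raw input.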
stegben commented 5 years ago

What is this? Why should we add it? Could you provide any examples?