Yoctol / strpipe

text preprocessing pipeline

Fix mutable pad #62

Closed (stegben closed 5 years ago)

stegben commented 5 years ago

The original Pad implementation mutates its input lists in place, so padding also changes the intermediates captured at the checkpoint:

import strpipe as sp

p = sp.Pipe()
p.add_step_by_op_name('AddSosEos')

data = [
    ['a', 'b', 'c'],
    ['d', 'e'],
]

result, _, _ = p.transform(data)
assert result == [['<SOS>', 'a', 'b', 'c', '<EOS>'], ['<SOS>', 'd', 'e', '<EOS>']]

p.add_checkpoint()
p.add_step_by_op_name('Pad')
p.fit(data)
_, _, intermediates = p.transform(data)
assert intermediates == [['<SOS>', 'a', 'b', 'c', '<EOS>'], ['<SOS>', 'd', 'e', '<EOS>', '<PAD>']]
# Wrong! The checkpoint comes before Pad, so the intermediates should not contain '<PAD>'.
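
For illustration, the same kind of bug can be reproduced without strpipe. The sketch below is not the library's code: `pad_in_place` and `pad_copy` are hypothetical helpers, and it assumes the padding step works on a list of token lists. It only shows why extending the caller's inner lists leaks padding into anything that still references them, and how building new lists avoids that.

```python
PAD = '<PAD>'  # assumed pad token, matching the example above


def pad_in_place(sequences, pad_token=PAD):
    """Hypothetical buggy variant: extends the caller's inner lists."""
    max_len = max((len(seq) for seq in sequences), default=0)
    for seq in sequences:
        # list.extend mutates seq, so every other reference to it changes too
        seq.extend([pad_token] * (max_len - len(seq)))
    return sequences


def pad_copy(sequences, pad_token=PAD):
    """Hypothetical non-mutating variant: builds a new list for each row."""
    max_len = max((len(seq) for seq in sequences), default=0)
    # seq + [...] creates a fresh list and leaves seq untouched
    return [seq + [pad_token] * (max_len - len(seq)) for seq in sequences]


# Stand-in for the data held at the checkpoint.
checkpoint = [['<SOS>', 'a', 'b', 'c', '<EOS>'], ['<SOS>', 'd', 'e', '<EOS>']]

# Non-mutating padding: the checkpoint keeps its original contents.
padded = pad_copy(checkpoint)

# In-place padding on a second copy: the "checkpoint" itself gains '<PAD>'.
buggy_checkpoint = [row[:] for row in checkpoint]
pad_in_place(buggy_checkpoint)
```

Copying each row costs some extra memory, but it keeps earlier pipeline stages independent of later ones, which is what a checkpoint is expected to guarantee.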