In the previous code, the chunksize variable is always computed as the current position minus the previous position. So in each tuple, the first element is the offset of the current chunk, whereas the second element is the size of the previous chunk.
This means that every tuple pairs an offset with the wrong size. Fixing up only the first chunk size does not correct this unless all chunks have the same size, in which case the fix-up simply replaces the first 0 with the actual chunk size.
    self.pos = []
    prev_pos = 0
    # Iterate through sequence to get frame offsets
    for match in self._fff_it:
        index = match.start()
        chunksize = index-prev_pos
        self.pos.append((index, chunksize))
        prev_pos = index
    # Fix up the first chunk size
    if len(self.pos) > 1:
        self.pos[0] = (0, self.pos[1][1])
    elif len(self.pos) == 1:
        self.pos[0] = (0, len(self.seq_blob))
The updated code should solve this issue.
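    # Offsets of every match; each chunk spans from one match to the next
    start_pos = [match.start() for match in self._fff_it]
    fff_offset = start_pos[:-1]
    # Plain lists do not support "-", so take the differences element-wise
    fff_size = [end - start for start, end in zip(start_pos[:-1], start_pos[1:])]
    self.pos = list(zip(fff_offset, fff_size))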
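To make the mismatch concrete, here is a minimal standalone sketch that runs both approaches on made-up offsets with unequal chunk sizes. The function names `previous_positions` and `updated_positions`, the offsets, and the blob length of 400 are all hypothetical and only serve the illustration; the class itself is not involved.

```python
def previous_positions(starts, blob_len):
    """Mimic the previous loop, including the first-chunk fix-up."""
    pos = []
    prev_pos = 0
    for index in starts:
        # Size is measured backwards, so it belongs to the previous chunk
        pos.append((index, index - prev_pos))
        prev_pos = index
    if len(pos) > 1:
        pos[0] = (0, pos[1][1])
    elif len(pos) == 1:
        pos[0] = (0, blob_len)
    return pos


def updated_positions(starts):
    """Pair each offset with the distance to the next offset."""
    return [(start, end - start) for start, end in zip(starts[:-1], starts[1:])]


# Hypothetical offsets: the real chunk sizes are 100, 150, 50 (and 100 to the end)
starts = [0, 100, 250, 300]

print(previous_positions(starts, blob_len=400))
# [(0, 100), (100, 100), (250, 150), (300, 50)]  <- sizes shifted by one chunk
print(updated_positions(starts))
# [(0, 100), (100, 150), (250, 50)]              <- each offset with its own size
```

Note that in this sketch the last offset has no following match, so it gets no entry in the updated list.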