backstopmedia / tensorflowbook


The iterative batch generation code is so strange to me in Chapter 6 04_arxiv preprocessing.py. #13

Open jeffacode opened 7 years ago

jeffacode commented 7 years ago

First, `for i in range(0, len(text) - self.length + 1, self.max_length // 2):`. I'm sorry, but what happens if `len(text)` is actually smaller than `self.length` (I assume it's the same as `self.max_length`)? And why is this step needed at all?
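To make the question concrete, here is a minimal standalone sketch of that loop as I understand it (my own reconstruction; `text`, `max_length`, and `sliding_windows` are placeholder names, not the book's exact code):

```python
def sliding_windows(text, max_length):
    """Cut `text` into fixed-size windows that overlap by half."""
    windows = []
    # Stride of max_length // 2 makes consecutive windows overlap.
    for i in range(0, len(text) - max_length + 1, max_length // 2):
        windows.append(text[i:i + max_length])
    return windows

# If len(text) < max_length, the range stop is <= 0, so the loop body
# never runs and the short text silently produces no windows at all:
print(sliding_windows(list(range(5)), 10))  # []

# Otherwise every window has exactly max_length elements:
print(sliding_windows(list(range(10)), 4))
```

So as far as I can tell, a text shorter than the window length is simply dropped, which is part of what I find strange.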

Second, `assert all(len(x) == len(windows[0]) for x in windows)`. Why does every window need to be the same length?
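My guess at why equal lengths would matter (this is just my reading, not something the book states): the windows presumably get stacked into one fixed-shape batch array, and that only works when every row has the same length:

```python
import numpy as np

# Equal-length windows stack into a clean (num_windows, length) array:
equal = [[1, 2, 3], [4, 5, 6]]
print(np.array(equal).shape)  # (2, 3)

# By contrast, ragged rows such as [[1, 2, 3], [4, 5]] cannot form a
# rectangular 2-D batch, which is (I assume) what the assert guards against.
```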

Next, the following `while True` loop. Won't it loop forever?

Last, `batch = windows[i: i + self.batch_size]`. I don't think the last batch generated will have the same size as the previous ones in the first dimension.
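Putting my last two questions together, here is a small sketch (my own, not the book's code) of what I believe the `while True` batching looks like. Since the function is a generator, the loop only advances when `next()` is called on it, so it does not actually run away; it just restarts from the top each epoch. But it does show the uneven last batch:

```python
def batches(windows, batch_size):
    """Yield batches of windows forever, restarting each epoch."""
    while True:
        for i in range(0, len(windows), batch_size):
            # When batch_size does not evenly divide len(windows),
            # the last slice here really is shorter than the others.
            yield windows[i:i + batch_size]

gen = batches(list(range(10)), 4)
print(next(gen))  # [0, 1, 2, 3]
print(next(gen))  # [4, 5, 6, 7]
print(next(gen))  # [8, 9]  <- shorter last batch
print(next(gen))  # [0, 1, 2, 3]  <- epoch restarts, no infinite spin
```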

Hope someone can answer my questions :)