pytorch / text

Models, data loaders and abstractions for language processing, powered by PyTorch
https://pytorch.org/text
BSD 3-Clause "New" or "Revised" License

yield of yield ? #1363

Open Dongximing opened 3 years ago

Dongximing commented 3 years ago

❓ Questions and Help

Description

Hi, I was reading the torchtext documentation. The function 'ngrams_iterator' is already a generator (it uses yield), but in the example that calls 'build_vocab_from_iterator' you still have 'yield ngrams_iterator', so it becomes a 'yield of yield'. Is that OK? Thanks.

parmeet commented 3 years ago

Hi @Dongximing, could you please provide additional details, for example using sample code snippets? How are you using 'build_vocab_from_iterator' together with 'ngrams_iterator'? It would also be great if you could post any errors you encountered. Thanks!

Dongximing commented 3 years ago

def yield_tokens(data_iter, ngrams):
    for _, text in data_iter:
        yield ngrams_iterator(tokenizer(text), ngrams)

I was just wondering: 'ngrams_iterator' already returns an iterator, but 'yield' wraps 'ngrams_iterator(tokenizer(text), ngrams)' in an iterator again. That is why I called it a "yield of yield". When I add list(), i.e. list(ngrams_iterator(tokenizer(text), ngrams)), I get the same result as without list().
So my question is: since 'ngrams_iterator' already returns an iterator, should we convert it to a list first and then use yield to turn the whole thing into an iterator?

@parmeet
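
For reference, the pattern in question can be tried out without torchtext. The sketch below is only an illustration: `simple_ngrams` is a hypothetical pure-Python stand-in for 'ngrams_iterator', `text.split()` stands in for the tokenizer, and `count_tokens` is an assumed consumer that, like a vocabulary builder, loops over every yielded item and counts its tokens. It compares yielding the inner generator directly with yielding list(...) of it.

```python
from collections import Counter


def simple_ngrams(tokens, n):
    """Hypothetical stand-in for ngrams_iterator: unigrams first, then 2..n-grams."""
    for tok in tokens:
        yield tok
    for size in range(2, n + 1):
        # zip over shifted copies of the token list gives consecutive windows
        for gram in zip(*[tokens[i:] for i in range(size)]):
            yield " ".join(gram)


def yield_lazy(data_iter, n):
    # "yield of yield": each yielded item is itself a generator
    for _, text in data_iter:
        yield simple_ngrams(text.split(), n)


def yield_lists(data_iter, n):
    # Same as above, but each generator is materialized into a list first
    for _, text in data_iter:
        yield list(simple_ngrams(text.split(), n))


def count_tokens(token_iter):
    # Assumed consumer: iterates the outer iterator, then each inner iterable
    counter = Counter()
    for tokens in token_iter:
        counter.update(tokens)
    return counter


data = [(0, "here we are"), (1, "we are here")]
lazy_counts = count_tokens(yield_lazy(data, 2))
list_counts = count_tokens(yield_lists(data, 2))
print(lazy_counts == list_counts)
```

Under these assumptions both variants produce the same counts, because the consumer walks each inner iterable exactly once. Wrapping in list() would only matter if the same yielded item had to be iterated more than once, since a generator is exhausted after a single pass.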