Closed enpassanty closed 1 year ago
I'm getting a "dataset is required" error with this command:
./getalltokens -capcode true -charset UTF-8 -chunk-size 100000 -dataset /Users/me/wikitext-103-raw/wikitest.raw -max-token-length 1000 -micro-chunks 10 -min-occur 30 -min-occur-chunk 3 -output string enwiki_dict.txt -workers 1
The command line flags are parsed using Go standard library, so it should work. Try putting the filepath in "quotes"
I'm getting a "dataset is required" error with this command:
./getalltokens -capcode true -charset UTF-8 -chunk-size 100000 -dataset /Users/me/wikitext-103-raw/wikitest.raw -max-token-length 1000 -micro-chunks 10 -min-occur 30 -min-occur-chunk 3 -output string enwiki_dict.txt -workers 1