Step 1 - Specify the exact file that needs to be downloaded from the Wikipedia Clickstream data, or make the tail command more generic (it currently hardcodes '2017_01_en_clickstream.tsv').
Step 2 - Provide more detail on why I would want to create a local Kafka service. Is this related to downloading Apache Kafka earlier? If I want to use IBM Message Hub instead, how would I do that? Is there a fee involved? If so, maybe we shouldn't include it as an option.
Step 3 - "a handy command line utility for this purpose" - what purpose is that?
Step 3 - The command contains an "ip:port" placeholder. What should I use here?
Run the script:
Each command should be on a single line so that the user can cut and paste it into the Scala shell.
Setup clickstream
Run the script:
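For the Step 1 suggestion, one possible way to make the tail command generic is to look the file up by pattern instead of hardcoding '2017_01_en_clickstream.tsv'. This is only a sketch: the helper name find_clickstream_file is hypothetical, and it assumes the downloaded file keeps the '*_en_clickstream.tsv' naming and sits in the directory passed as the first argument.

```shell
# Hypothetical helper (not part of the tutorial): print the first file in a
# directory whose name matches the assumed pattern *_en_clickstream.tsv.
# "First" means first in the shell's glob (alphabetical) order.
find_clickstream_file() {
  dir="${1:-.}"
  for f in "$dir"/*_en_clickstream.tsv; do
    if [ -e "$f" ]; then
      printf '%s\n' "$f"
      return 0
    fi
  done
  # Nothing matched: report on stderr and signal failure to the caller.
  echo "no *_en_clickstream.tsv file found in $dir" >&2
  return 1
}

# Assumed usage; substitute whatever tail options the tutorial step uses:
#   tail "$(find_clickstream_file .)"
```

With something like this, the step would work for whichever monthly dump the reader downloaded, as long as only one matching file is present.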