A lot of versions seem to be created in the hub with each submit.
Is this desirable?
Well, you should only submit when you're ready. Hence the difference between 'Save' and 'Submit'.
Easy solution: change 'submit' to 'publish' with an extra dialog that asks for additional metadata. (Note: this may not be the best place to ask for this metadata, but we need a dialog for dataset-global metadata such as original owner/region etc.)
AR: Yes, I think the word "publish (dataset)" would already prevent this from happening. Part of it is also demo-specific behaviour from the user. Another option is to have some sort of progress indication, e.g. colouring the variables on the left of the screen once you've linked them (this would obviate the need to check the hub for progress). Finally, we could also check the hub for multiple datasets made by the same user that have only minor differences.
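
A rough sketch of that last check, to make the idea concrete. Everything here is an assumption for illustration: the hub's real data model isn't described in these notes, so the `user`, `id` and `links` fields, the function names, and the similarity threshold are all hypothetical. It treats each submitted dataset as a mapping from variable name to whatever it was linked to, and flags pairs from the same user that agree on most variables.

```python
from itertools import combinations


def similarity(a: dict, b: dict) -> float:
    """Fraction of variables (over the union of both) linked identically."""
    keys = set(a) | set(b)
    if not keys:
        return 1.0
    # Sentinel so that a variable missing on one side never counts as a match.
    missing = object()
    same = sum(1 for k in keys if a.get(k, missing) == b.get(k, missing))
    return same / len(keys)


def near_duplicates(submissions, threshold=0.75):
    """Return (user, id_a, id_b) for same-user pairs at or above the threshold.

    `submissions` is a list of dicts with hypothetical keys
    'id', 'user' and 'links' (variable name -> linked term).
    """
    by_user = {}
    for sub in submissions:
        by_user.setdefault(sub["user"], []).append(sub)

    flagged = []
    for user, subs in by_user.items():
        # Compare every pair of submissions made by this user.
        for first, second in combinations(subs, 2):
            if similarity(first["links"], second["links"]) >= threshold:
                flagged.append((user, first["id"], second["id"]))
    return flagged


if __name__ == "__main__":
    base = {"age": "A", "sex": "B", "region": "C", "income": "D"}
    demo = [
        {"id": 1, "user": "alice", "links": base},
        {"id": 2, "user": "alice", "links": {**base, "income": "X"}},
        {"id": 3, "user": "bob", "links": base},
    ]
    print(near_duplicates(demo))
```

The pairwise comparison is quadratic in the number of submissions per user, which should be fine for a handful of versions each; the threshold would need tuning against real hub data.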