This PR improves the documentation and overall structure of the weight conversion utilities available. Main features:
[x] Updated the list of new features (to mention RoPE scaling, push_to_hub, instruction tuning, metrics support, and codellama).
[x] Added brief push_to_hub documentation, closing #39.
[x] Removed convert_llama2hf.py: huggingface officially provides a conversion script for llama v1, and converted weights for both llama v1 and llama v2 are already available on the hub (decapoda-research/llama-7b-hf, meta-llama/Llama-2-7b-hf).
[x] Renamed weights2megatron -> weights_conversion for clarity, and moved the helper utilities in that directory to weights_conversion/utils/ so that the scripts most relevant to the user are easier to find. Updated all references to the old weights2megatron paths (e.g. in the examples/ directory and in tests/test_llama_weights).
[x] Removed the old tokenization-utils/ directory and the weights2megatron/README.md file, as most of their content is already covered in docs/; the information that was not yet present was added to docs/.
[x] Added docstrings to the weights_conversion utilities, and expanded docs/ with more information about those files.