pytorch-labs / gpt-fast

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.

GGUF fp32/fp16 conversion to checkpoint #134


mergennachin commented 3 months ago

Summary:

This only works for fp32 and fp16 tensor types, so it doesn't provide much value on its own yet: `convert_hf_checkpoint.py` can already generate an equivalent `.pth` checkpoint directly, without the GGUF format indirection. However, this PR lays the foundation and validates that the basic fp32 and fp16 paths work correctly. In the future, we will support running the quantized version of the GGUF graph in eager mode.
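For context, a minimal sketch of what the fp16/fp32 conversion path boils down to (this is not the PR's actual code). It assumes the `gguf` pip package's `GGUFReader` API and a hypothetical `map_gguf_name` helper for translating llama.cpp tensor names into the gpt-fast checkpoint scheme; shape/layout fixes and all quantized types are deliberately out of scope here:

```python
import numpy as np
import torch
from gguf import GGUFReader, GGMLQuantizationType


def map_gguf_name(name: str) -> str:
    # Hypothetical: translate llama.cpp names (e.g. "blk.0.attn_q.weight")
    # into the gpt-fast checkpoint naming scheme. Identity for illustration.
    return name


def gguf_to_state_dict(gguf_file: str) -> dict:
    reader = GGUFReader(gguf_file)
    state_dict = {}
    for t in reader.tensors:
        # Only plain fp32/fp16 tensors are handled; anything quantized
        # (Q4_0, Q8_0, ...) would need a dequantization step.
        if t.tensor_type not in (GGMLQuantizationType.F32, GGMLQuantizationType.F16):
            raise ValueError(f"unsupported tensor type {t.tensor_type} for {t.name}")
        # t.data is a numpy view over the file; copy it into a torch tensor.
        # Reshaping/transposing to the exact checkpoint layout is omitted here.
        state_dict[map_gguf_name(t.name)] = torch.from_numpy(np.array(t.data))
    return state_dict


if __name__ == "__main__":
    torch.save(gguf_to_state_dict("ggml-model-f16.gguf"), "model_gguf.pth")
```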

Test Plan:

  1. Setup:

     ```
     pip install gguf
     git clone git@github.com:ggerganov/llama.cpp.git
     python scripts/download.py --repo_id [HF-dir]
     ```
  2. Preparation: convert the existing HF model to fp16, which generates `[HF-dir]/ggml-model-f16.gguf`:

     ```
     python llama.cpp/convert.py [HF-dir] --outtype f16
     ```
  3. Convert the GGUF file to a checkpoint:

     ```
     python scripts/convert_from_gguf.py --gguf_file [HF-dir]/ggml-model-f16.gguf --checkpoint_file [HF-dir]/model_gguf.pth
     ```
  4. Validate that it works:

     ```
     python generate.py --checkpoint_path [HF-dir]/model_gguf.pth --device=cpu --prompt "Hello, my name is" --max_new_tokens 20
     ```
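An optional sanity check beyond the steps above (not part of the test plan as written) would be to diff the GGUF-derived checkpoint against the one produced directly by `convert_hf_checkpoint.py`; the file names below are placeholders and assume the two state dicts share the same keys after name mapping:

```python
import torch

a = torch.load("model.pth", map_location="cpu")       # from convert_hf_checkpoint.py
b = torch.load("model_gguf.pth", map_location="cpu")  # from convert_from_gguf.py

assert a.keys() == b.keys(), "checkpoint key sets differ"
for k in a:
    # fp16 round-tripping through GGUF can introduce small differences,
    # so compare in fp32 with a loose tolerance.
    assert torch.allclose(a[k].float(), b[k].float(), atol=1e-3), f"mismatch in {k}"
print("checkpoints match")
```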
malfet commented 3 months ago

Why import GGUF when one can do the decoding in place using native PyTorch? See https://github.com/malfet/llm_experiments/blob/74a935344fbce5680dbd2dafc7dfd95231303444/run_llama.py#L447
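To illustrate the dependency-free direction malfet is pointing at, here is a small sketch that reads just the GGUF header with the standard library, following the published GGUF spec (magic, version, tensor count, metadata KV count); full tensor decoding, as in the linked run_llama.py, would continue parsing from this point:

```python
import struct


def read_gguf_header(path: str):
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError("not a GGUF file")
        (version,) = struct.unpack("<I", f.read(4))
        if version < 2:
            raise ValueError(f"GGUF v{version} uses 32-bit counts; not handled here")
        # v2+ headers store tensor and metadata KV counts as little-endian uint64.
        n_tensors, n_kv = struct.unpack("<QQ", f.read(16))
    return version, n_tensors, n_kv


if __name__ == "__main__":
    print(read_gguf_header("ggml-model-f16.gguf"))
```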