Read README.md in setup.py with an explicit UTF-8 encoding, since some systems use a different default text encoding.
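A minimal sketch of the change, assuming setup.py reads the README like this (the function name is illustrative):

```python
from pathlib import Path

def read_long_description(path="README.md"):
    # An explicit encoding avoids UnicodeDecodeError on systems whose
    # default text encoding is not UTF-8 (e.g. cp1252 on Windows).
    return Path(path).read_text(encoding="utf-8")
```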
Added a device-type check so that torch.cuda.Stream() and pin_memory() are only used on CUDA-capable devices. This resolves errors that occurred during inference on CPU-only machines.
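A sketch of the guard, assuming the device type is available as a string (the helper names are illustrative, not the PR's actual functions):

```python
def cuda_stream_or_none(device_type):
    """Return a torch.cuda.Stream() on CUDA devices, None otherwise."""
    if device_type == "cuda":
        import torch  # imported on the CUDA path only
        return torch.cuda.Stream()
    return None

def maybe_pin_memory(tensor, device_type):
    # pin_memory() requires CUDA; skip it on CPU-only devices,
    # where calling it raises an error.
    if device_type == "cuda":
        return tensor.pin_memory()
    return tensor
```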
Added a post-install command that upgrades transformers to the latest version, so the "ValueError: rope_scaling must be a dictionary with two fields" error does not occur.
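One common setuptools pattern for a post-install step looks like the following; this is an assumption about the mechanism, not necessarily the PR's exact implementation:

```python
import subprocess
import sys
from setuptools.command.install import install

class PostInstall(install):
    """Custom install command that upgrades transformers afterwards."""
    def run(self):
        install.run(self)
        # Older transformers releases reject the newer rope_scaling config
        # format; upgrading avoids the ValueError at model-load time.
        subprocess.check_call(
            [sys.executable, "-m", "pip", "install", "--upgrade", "transformers"]
        )
```

setup() would then register it via `cmdclass={"install": PostInstall}`.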
Added support for non-sharded models on the Hugging Face Hub, i.e. repos that contain a single model file such as 'model.safetensors'.
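Sharded repos ship a `model.safetensors.index.json` plus shard files, while single-file repos ship only `model.safetensors`; a hedged sketch of how the two cases could be distinguished (function name illustrative):

```python
import os

def locate_weights(repo_dir):
    """Return (kind, path) for a sharded or single-file checkpoint."""
    index = os.path.join(repo_dir, "model.safetensors.index.json")
    single = os.path.join(repo_dir, "model.safetensors")
    if os.path.isfile(index):
        return "sharded", index   # the index lists the individual shards
    if os.path.isfile(single):
        return "single", single   # whole model in one safetensors file
    raise FileNotFoundError(f"no model weights found in {repo_dir}")
```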
Addresses issue #169.
changelog:
- read README.md in setup.py with an explicit UTF-8 encoding
- guard torch.cuda.Stream() and pin_memory() behind a CUDA device-type check
- post-install upgrade of transformers to prevent the rope_scaling ValueError
- support single-file (non-sharded) models such as 'model.safetensors'