This PR:

- Fixes an issue with the `_add_embeddings_by_chunks()` method, which was not including all embeddings because it assumed that the slicing range interval was `[start, end]`, while it is actually `[start, end)`, i.e. the last value is excluded (see the slicing sketch below).
- Adds a missing config for the `GpuIndexFlatIP` index and its corresponding type (see the faiss sketch below).
- Allows running `finetune_qa.py` in a user-friendly way. On a machine with 8 GPUs, we can run:

```bash
torchrun --standalone --nnodes 1 --nproc_per_node 8 finetune_qa.py \
    --train_data $DATA_DIR/nq_data/train.100-shot.jsonl \
    --eval_data $DATA_DIR/nq_data/test.jsonl \
    --name "my_finetuning_experiment" \
    --checkpoint_dir $DATA_DIR/experiments/ \
    --total_steps 31 \
    --index_mode faiss \
    --faiss_index_type pq \
    --faiss_code_size 16 \
    --model_path $DATA_DIR/models/atlas/base \
    --load_index_path $DATA_DIR/indices/atlas/wiki/base \
    --reader_model_type google/t5-base-lm-adapt
```
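For the slicing fix in the first bullet, here is a minimal, hypothetical sketch of chunked iteration (not the actual `_add_embeddings_by_chunks()` code) showing why the half-open `[start, end)` interval matters; the array and chunk size are made-up values:

```python
import numpy as np

embeddings = np.arange(10).reshape(10, 1)  # 10 toy embedding rows
chunk_size = 4

# Python slices are half-open [start, end), so iterate start offsets up to
# len(embeddings); the final (possibly shorter) chunk is still included.
chunks = [embeddings[start:start + chunk_size]
          for start in range(0, len(embeddings), chunk_size)]

# Assuming a closed [start, end] interval and stopping one element early
# would silently drop the tail of the array.
assert sum(len(c) for c in chunks) == len(embeddings)
```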
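For the `GpuIndexFlatIP` bullet, the underlying faiss construction that the new config and type map to looks roughly like this (a sketch of the plain faiss API, not the Atlas index code; the dimension, device, and data are placeholder values):

```python
import numpy as np
import faiss  # requires the faiss-gpu build

d = 768                      # embedding dimension (placeholder)
xb = np.random.rand(1000, d).astype("float32")

res = faiss.StandardGpuResources()
cfg = faiss.GpuIndexFlatConfig()
cfg.device = 0               # GPU that holds the index

# Exact (non-quantized) inner-product search on GPU.
index = faiss.GpuIndexFlatIP(res, d, cfg)
index.add(xb)
scores, ids = index.search(xb[:5], 10)  # top-10 neighbours for 5 queries
```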