huggingface / optimum-habana

Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU)
Apache License 2.0

Where in the directory "/tmp/tst-summarization" is the summarization output stored? #292

Closed Abhaycnvrg closed 1 year ago

Abhaycnvrg commented 1 year ago

System Info

Optimum Habana : 1.6.0
SynapseAI : 1.10.0
Docker Image : Habana® Deep Learning Base AMI (Ubuntu 20.04)
Volume : 1000 GiB

Reproduction

1. Start an EC2 instance with a DL1 resource and this image: Habana® Deep Learning Base AMI (Ubuntu 20.04)
2. Run these commands:
   a. docker run -it --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --net=host --ipc=host vault.habana.ai/gaudi-docker/1.10.0/ubuntu20.04/habanalabs/pytorch-installer-2.0.1:latest
   b. git clone https://github.com/huggingface/optimum-habana.git
   c. pip install optimum[habana]
   d. cd examples
   e. cd summarization
   f. pip install -r requirements.txt

python run_summarization.py \
    --model_name_or_path t5-small \
    --do_eval \
    --dataset_name cnn_dailymail \
    --dataset_config "3.0.0" \
    --source_prefix "summarize: " \
    --output_dir /tmp/tst-summarization \
    --per_device_train_batch_size 4 \
    --per_device_eval_batch_size 4 \
    --overwrite_output_dir \
    --predict_with_generate \
    --use_habana \
    --use_lazy_mode \
    --use_hpu_graphs_for_inference \
    --gaudi_config_name Habana/t5 \
    --ignore_pad_token_for_loss False \
    --pad_to_max_length \
    --save_strategy epoch \
    --throughput_warmup_steps 3

Expected behavior

I need a file containing the summarized text, not just the evaluation metrics.

regisss commented 1 year ago

Could you try using --do_predict instead of --do_eval? With --do_predict, results are processed and written here: https://github.com/huggingface/optimum-habana/blob/32f8555b543afd696064e8a56979606880f17995/examples/summarization/run_summarization.py#L744

--do_eval and --do_predict do almost the same thing, except that --do_eval is usually applied to the validation set to check your metrics, while --do_predict is used on the test set to generate the intended results.
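For reference, here is a minimal sketch of how one might read the summaries back after a --do_predict run. It assumes the predictions file is named generated_predictions.txt (the name used by the upstream Transformers run_summarization.py; check the linked line if your version differs) and that it holds one summary per line. The helper name read_predictions is hypothetical and not part of optimum-habana.

```python
from pathlib import Path


def read_predictions(output_dir, filename="generated_predictions.txt"):
    """Return the generated summaries found in output_dir, one per line.

    Assumption: the script wrote its predictions to `filename` inside
    `output_dir`, separated by newlines.
    """
    path = Path(output_dir) / filename
    with path.open(encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]


if __name__ == "__main__":
    out_dir = "/tmp/tst-summarization"  # the --output_dir used in this issue
    if (Path(out_dir) / "generated_predictions.txt").exists():
        # Show the first few summaries as a quick sanity check.
        for i, summary in enumerate(read_predictions(out_dir)[:3]):
            print(f"[{i}] {summary}")
    else:
        print("No predictions file found; run the script with --do_predict first.")
```

If the file is missing after a --do_predict run, it is worth confirming that --predict_with_generate was also passed, since the upstream script only writes the text file in that case.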

Abhaycnvrg commented 1 year ago

Thanks! That worked. An unrelated question:

What is the difference between the image we choose when creating an instance and the container image we use in the command below, after we connect to the instance over SSH?

docker run -it --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --net=host --ipc=host vault.habana.ai/gaudi-docker/1.10.0/ubuntu20.04/habanalabs/pytorch-installer-2.0.1:latest

Are they the same thing, where the first step creates it and the second goes "inside" it to run programs?

regisss commented 1 year ago

The AMI you used when launching your instance is an image used by AWS to set up the virtual machine running on the hardware you chose. Here is the official AWS doc about AMIs: https://docs.aws.amazon.com/en_us/AWSEC2/latest/UserGuide/AMIs.html

On the other hand, a Docker image contains all the dependencies you need to run your code (because they may not be installed on your AWS instance or on your laptop). You can then start a container from this image to actually run your code (that's the environment you enter after running docker run ...). Here is a good summary about Docker images and containers: https://circleci.com/blog/docker-image-vs-container/