Memory Consumption Increases During Dataset Iteration

To address the issue of increasing memory usage, I optimized the loading of image data as follows:

Extracting Image Files: I first extracted images.tar.gz into a dedicated directory so that every image is available on disk as an individual file.
```
tar -xzvf images.tar.gz -C <image_directory>
```
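The same extraction can also be done from Python. Below is a minimal sketch using the standard-library `tarfile` module; the helper name `extract_images` is hypothetical and not part of the dataset repo.

```python
import tarfile
from pathlib import Path


def extract_images(archive_path, image_directory):
    """Extract a .tar.gz archive into image_directory and return the extracted file paths."""
    target = Path(image_directory)
    target.mkdir(parents=True, exist_ok=True)
    # Newer Python versions support extraction filters; use the safe "data"
    # filter when it is available and fall back to the default otherwise.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(archive_path, mode="r:gz") as archive:
        archive.extractall(target, **kwargs)
        names = [member.name for member in archive.getmembers() if member.isfile()]
    return [target / name for name in names]


# Example call (placeholder paths, as in the shell command above):
# extract_images("images.tar.gz", "<image_directory>")
```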
Loading Data: I used pandas to load the Parquet files (ref-l4-test.parquet, ref-l4-val.parquet) containing the image information.
```
import pandas as pd

df = pd.read_parquet('<parquet_file_path>')
print(df.size)  # number of cells (rows * columns); use len(df) for the row count
```
Iterating Over the DataFrame: I iterated over the rows of the DataFrame and, for each row, extracted the relevant fields (id, file_name, and caption) and loaded the corresponding image from disk.
```
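for index, row in df.iterrows():
    info = row.to_dict()
    image_id = info['id']  # renamed so it does not shadow the built-in id()
    file_name = info['file_name']
    caption = info['caption']
    image_path = "<image_directory>/" + file_name
    # load_image is not defined in this snippet; substitute your own image loader
    image_source, image = load_image(image_path)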
```
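The loop above can also be wrapped in a generator so that only one image is held at a time and the loading logic is reusable. This is a rough sketch, not the exact code I ran: `iter_samples` and the `loader` argument are hypothetical names, and `rows` is assumed to be any iterable of dict-like records with `id`, `file_name`, and `caption` keys.

```python
import os


def iter_samples(rows, image_directory, loader):
    """Yield (id, caption, loaded_image) one row at a time.

    Each image is loaded only when its row is reached, so a previously
    yielded image can be freed once the caller drops its reference.
    """
    for info in rows:
        image_path = os.path.join(image_directory, info["file_name"])
        yield info["id"], info["caption"], loader(image_path)


# Possible usage with the DataFrame from above (load_image is your own loader):
# rows = (row.to_dict() for _, row in df.iterrows())
# for image_id, caption, loaded in iter_samples(rows, "<image_directory>", load_image):
#     ...
```

Passing the loader in as an argument keeps the sketch independent of any particular image library.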
With this approach, the problem was solved: memory usage stays stable throughout the iteration.
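One way to check whether memory stays flat is to sample it periodically while iterating. The sketch below uses the standard-library `tracemalloc` module; the helper name `sample_memory` and the default `chunk_size` are hypothetical choices. Note that `tracemalloc` only sees allocations made through Python's memory allocator, so memory held by native image libraries may not show up, and a process-level tool may be needed in that case.

```python
import tracemalloc


def sample_memory(iterable, chunk_size=100):
    """Consume iterable and record traced memory (in bytes) every chunk_size items."""
    samples = []
    tracemalloc.start()
    try:
        for count, _item in enumerate(iterable, start=1):
            if count % chunk_size == 0:
                current, _peak = tracemalloc.get_traced_memory()
                samples.append(current)
    finally:
        tracemalloc.stop()
    return samples


# Roughly constant values across the samples suggest stable memory usage;
# steadily growing values suggest that something is accumulating.
```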
