prabahar87 / automation


CSV 45mb each takes forever to accomplish #1

Open Gaetano-Dati opened 2 weeks ago

Gaetano-Dati commented 2 weeks ago

Hi, I have two CSV files to compare, each about 45 MB. I can't tell whether the script is running out of memory or just taking a long time. It would help to add some progress feedback. Thank you.

prabahar87 commented 1 week ago

#!/bin/bash

original1="original1.csv"
original2="original2.csv"

matched="matched.csv"
unmatched="unmatched.csv"

# Create (or empty) the output files
> "$matched"
> "$unmatched"

# Associative array used as a set of lowercased lines (requires bash 4+)
declare -A original2_lines

echo "Loading $original2 into memory..."
while IFS= read -r line2; do
    line2_lower=$(echo "$line2" | tr '[:upper:]' '[:lower:]')  # Convert to lowercase
    original2_lines["$line2_lower"]=1
done < "$original2"
echo "Finished loading $original2."

line_number=0
echo "Comparing lines from $original1..."
while IFS= read -r line1; do
    line1_lower=$(echo "$line1" | tr '[:upper:]' '[:lower:]')  # Convert to lowercase
    ((line_number++))

    # Progress feedback every 1000 lines
    if [ $((line_number % 1000)) -eq 0 ]; then
        echo "Processed $line_number lines from $original1..."
    fi

    if [[ -n ${original2_lines["$line1_lower"]} ]]; then
        echo "$line1" >> "$matched"
    else
        echo "$line1" >> "$unmatched"
    fi
done < "$original1"

echo "Comparison completed (case-insensitive)."

Can you try this one?
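If the script above is still slow on 45 MB inputs, one likely cause is that it starts an `echo | tr` pipeline for every line of both files. Below is a rough sketch of the same case-insensitive, whole-line comparison done in a single `awk` process. The function name `compare_csv` and the 100000-line progress interval are assumptions made for illustration, not something from the thread, and the sketch assumes that exact whole-line matching is what is wanted.

```shell
# Hypothetical helper (not from the thread): write each line of original1 to
# "matched" if the same line, ignoring case, occurs anywhere in original2,
# otherwise to "unmatched".
compare_csv() {
    local original1=$1 original2=$2 matched=$3 unmatched=$4

    # Create (or empty) the output files so they exist even if nothing is written.
    : > "$matched"
    : > "$unmatched"

    awk -v matched="$matched" -v unmatched="$unmatched" '
        # First file argument (original2): remember every line, lowercased.
        FILENAME == ARGV[1] { seen[tolower($0)] = 1; next }

        # Second file argument (original1): route each line by set lookup.
        {
            if (tolower($0) in seen) {
                print > matched
            } else {
                print > unmatched
            }
            # Progress feedback on stderr (interval is an arbitrary choice).
            if (FNR % 100000 == 0) {
                printf "Processed %d lines\n", FNR > "/dev/stderr"
            }
        }
    ' "$original2" "$original1"
}
```

It could be called as `compare_csv original1.csv original2.csv matched.csv unmatched.csv`. The lookup table still holds every line of the second file in memory, so memory use should be on the order of that file's size. The progress lines on stderr help show whether the run is still advancing.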