vzhd1701 / csv2notion

Import/Merge CSV files into Notion database
MIT License
75 stars 10 forks

[Bug]: Merging the same CSV multiple times results with some (random) duplicate rows #41

Open bumper314 opened 1 week ago

bumper314 commented 1 week ago

csv2notion version

0.3.9

What OS are you using?

MacOS

OS Version / Linux distribution

macOS 10.14, Python 3.12.5

Bug description

  1. Start with an empty database
  2. Import the CSV file with csv2notion --token "$token" --url "$url" --merge "youtube2csv_spanishafterhours.csv"
  3. Run the command again (which shouldn't change anything). This results in a few duplicate rows at the bottom. Run it again and you'll get more duplicate rows, but not necessarily the same ones as on the second run. It's kind of random.

I can't find any obvious reason for the duplicate rows. The Titles (key column) are mostly ASCII, but some contain unusual Unicode characters such as zero-width joiners and emoji.
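
For anyone trying to rule out the input file, here is a minimal sketch of a check on the key column. It is not part of csv2notion; the function name `audit_key_column` and the column name `Title` are assumptions for illustration. It reports keys that are repeated within the CSV itself and keys that contain invisible format characters such as the zero-width joiner.

```python
import csv
import io
import unicodedata
from collections import Counter


def audit_key_column(csv_text, key_column):
    """Return (duplicate_keys, keys_with_invisible_chars) for CSV text.

    Hypothetical helper for debugging; not a csv2notion API.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    keys = [row[key_column] for row in reader]

    # Keys that appear more than once in the CSV itself.
    counts = Counter(keys)
    duplicates = sorted(k for k, n in counts.items() if n > 1)

    # Keys containing format characters (Unicode category "Cf"),
    # for example U+200D ZERO WIDTH JOINER.
    invisible = {}
    for key in keys:
        found = [f"U+{ord(ch):04X}" for ch in key
                 if unicodedata.category(ch) == "Cf"]
        if found:
            invisible[key] = found
    return duplicates, invisible


# Small made-up sample; the column name "Title" is an assumption.
sample = "Title,Views\nHola,1\nHola,2\nAdi\u200dos,3\n"
dups, hidden = audit_key_column(sample, "Title")
print(dups)
print(hidden)
```

To run it on a real file, read the file as text first, for example with `open(path, encoding="utf-8", newline="").read()`. If both results come back empty for the attached CSV, the duplicates are probably not caused by the file contents alone.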

Example CSV file to demonstrate the issue: youtube2csv_spanishafterhours.csv

Log excerpt

Nothing helpful in the log, even with --verbose

2024-09-17 13:23:55,333 [INFO    ] Validating CSV & Notion DB schema
2024-09-17 13:23:55,653 [INFO    ] Uploading youtube2csv_spanishafterhours.csv...
2024-09-17 13:24:00,856 [INFO    ] Done!

bumper314 commented 1 week ago

Here are the titles for the rows that get duplicated:

The first two titles are pure ASCII (only the last one is not), so I don't think this is a Unicode issue. I also tried normalizing the .csv to all four Unicode normalization forms (NFC, NFD, NFKC, NFKD), but I still get the duplicate rows on import.
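
The normalization check can be reproduced with something like the sketch below. It uses only Python's standard `unicodedata` module; the helper name `distinct_keys_per_form` and the sample keys are made up for illustration. It counts how many distinct keys remain under each normalization form, so a drop in the count would indicate titles that differ only in their Unicode encoding.

```python
import unicodedata

FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def distinct_keys_per_form(keys):
    """Count distinct keys after applying each Unicode normalization form.

    Hypothetical helper for debugging; not a csv2notion API.
    """
    return {
        form: len({unicodedata.normalize(form, k) for k in keys})
        for form in FORMS
    }


# "é" as a single code point vs. "e" plus a combining acute accent:
# the two strings look identical but compare as different.
keys = ["caf\u00e9", "cafe\u0301", "plain ascii"]
raw_count = len(set(keys))
per_form = distinct_keys_per_form(keys)
print(raw_count)
print(per_form)
```

If the raw count and every per-form count are equal for the real key column, normalization differences are unlikely to explain the duplicates, which would match the observation above.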

BTW, --verbose combined with --fail-on-duplicates should output the duplicates to make it easier for people to find issues.