Closed chassing closed 4 years ago
Oh well, didn't want to do it on my laptop but I think I got all of the issues now. Running a postgres test import of the current datasets at the moment. This will take a while though...
Apparently some porn producer decided to paste the description as alternative title, resulting in a length of 831 (tconst=tt9109230).
The good news: transfer with postgres works now. The fixed version is on PyPI.
The bad news is the performance, especially for title.principals.tsv.gz. On my oldish 2016 MacBook Pro it eventually says:
INFO:pimdb: added 38756173 rows in 493:50.602 (1307.98 rows per second)
That's more than 8 hours... 😞
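For reference, the "more than 8 hours" figure follows directly from the log line above. A quick sanity check, assuming the elapsed time is printed as minutes:seconds (the variable names are just for illustration):

```python
# Values copied from the log line:
# "added 38756173 rows in 493:50.602 (1307.98 rows per second)"
rows = 38_756_173
minutes, seconds = 493, 50.602  # assumes the format is minutes:seconds

elapsed_seconds = minutes * 60 + seconds
rows_per_second = rows / elapsed_seconds
hours = elapsed_seconds / 3600

print(f"{rows_per_second:.2f} rows per second, {hours:.2f} hours")
```

The computed rate matches the one in the log, which supports reading 493:50.602 as roughly 8.2 hours.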
If you really want to go through this, remember: unlike with SQLite, with Postgres you can run multiple "pimdb transfer" commands in parallel, each specifying a different dataset. "pimdb transfer all" transfers everything with one command but does so sequentially, which takes a few hours more for the whole process to finish.
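A minimal sketch of the parallel approach, in case it helps. It just starts one "pimdb transfer" process per dataset and waits for all of them. The dataset names are assumptions derived from the IMDb file names (e.g. title.principals.tsv.gz), the database URL is a placeholder, and the `--database` option is what I'd expect from the pimdb command line, so check `pimdb transfer --help` before relying on it:

```python
import subprocess
import sys

# Assumed dataset names, derived from the IMDb file names without ".tsv.gz".
DATASETS = ["name.basics", "title.akas", "title.basics", "title.principals"]


def build_commands(datasets, database_url):
    """Build one "pimdb transfer" command line per dataset."""
    # "--database" is assumed here; verify with "pimdb transfer --help".
    return [
        ["pimdb", "transfer", "--database", database_url, dataset]
        for dataset in datasets
    ]


def run_parallel(commands):
    """Start all commands at once, then wait and collect their exit codes."""
    processes = [subprocess.Popen(command) for command in commands]
    return [process.wait() for process in processes]


if __name__ == "__main__":
    # Placeholder URL; replace with your own Postgres connection string.
    url = "postgresql://user:password@localhost:5432/pimdb"
    exit_codes = run_parallel(build_commands(DATASETS, url))
    sys.exit(1 if any(exit_codes) else 0)
```

This only makes sense with Postgres; as noted above, SQLite can't handle the parallel transfers.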
similar issue to #13 but this time on TitleAkas: