stripe-archive / mosql

MongoDB → PostgreSQL streaming replication
MIT License
1.63k stars 225 forks

Reimport takes several hours #71

Open adborden opened 9 years ago

adborden commented 9 years ago

Great project, it's been super effective for us when exporting small amounts of data from mongo, but as we've added more data we're seeing it take several hours rather than several minutes. Has anyone hit a similar wall? It is a substantial amount of data, but maybe we're doing something wrong?

Fields  Objects
3       42K
8       1.4M
5       18K
10      180K
4       1.5K
5       13K

nelhage commented 9 years ago

If you run an import with -v, it should show some basic stats about where the time is spent -- in SQL vs. reading/transforming the data. What does that show? Approximately what rate of import (documents/s) are you getting?
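As a rough way to answer the documents/s question, here is a minimal sketch that turns the object counts from the report into an average import rate. The 3-hour duration is an assumption chosen only for illustration (the report says "several hours"), and `import_rate` is a hypothetical helper, not part of mosql.

```python
# Object counts per row, taken from the table in the original report.
OBJECT_COUNTS = [42_000, 1_400_000, 18_000, 180_000, 1_500, 13_000]


def import_rate(total_documents: int, elapsed_seconds: float) -> float:
    """Return the average throughput in documents per second."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return total_documents / elapsed_seconds


total = sum(OBJECT_COUNTS)

# ASSUMPTION: "several hours" is taken to mean 3 hours, purely for illustration.
assumed_hours = 3
rate = import_rate(total, assumed_hours * 3600)

print(f"{total} documents in {assumed_hours} h is about {rate:.0f} documents/s")
```

Substituting the real wall-clock time from a `-v` run gives the figure to compare against the SQL vs. read/transform breakdown.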

ezesculli commented 8 years ago

I saw performance problems when there was high latency between the MongoDB and PostgreSQL databases. I tried different cloud configurations, using different MongoDB providers (MongoDB, EC2 instance, etc.) and PostgreSQL providers (Amazon RDS, Heroku Postgres, etc.), to get the best performance. Hosting everything on EC2 was the best solution!

Hope it helps!
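To illustrate why latency between the two databases can matter this much, here is a back-of-the-envelope sketch. It assumes a simplified model in which every batch of documents costs one network round trip; the batch sizes and latencies are hypothetical values, and this is not a description of how mosql actually batches its reads or writes.

```python
import math

# Total objects from the table in the original report.
TOTAL_DOCUMENTS = 1_654_500


def round_trip_overhead_seconds(
    documents: int, batch_size: int, latency_seconds: float
) -> float:
    """Estimate time spent waiting on the network.

    ASSUMPTION: each batch costs exactly one round trip and nothing
    overlaps; real clients may pipeline or batch differently.
    """
    batches = math.ceil(documents / batch_size)
    return batches * latency_seconds


# Hypothetical latencies: same host, same region, cross-provider.
for latency_ms in (0.5, 5, 50):
    for batch_size in (1, 1000):
        seconds = round_trip_overhead_seconds(
            TOTAL_DOCUMENTS, batch_size, latency_ms / 1000
        )
        print(
            f"latency={latency_ms} ms, batch={batch_size}: "
            f"{seconds / 3600:.2f} h of round-trip wait"
        )
```

Under these assumptions, one round trip per document at 5 ms already adds up to more than two hours, while batching 1000 documents per round trip brings the wait down to seconds. That is consistent with co-locating everything on EC2 helping, though the real cause should be confirmed from the `-v` timing output.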