blaze / odo

Data Migration for the Blaze Project
http://odo.readthedocs.org/
BSD 3-Clause "New" or "Revised" License

Problems with copying csv files to a remote postgresql database #550

Open nanounanue opened 7 years ago

nanounanue commented 7 years ago

Hi

I have a CSV file on my local disk, data/myfile.csv, and a PostgreSQL database on another machine.

When I try to run

odo("data/myfile.csv", "postgresql://user:password@some_ip:54322::tbl"), where user, password, some_ip and tbl have the right values, I get the following:

...
OperationalError: (psycopg2.OperationalError) could not open file "/home/nanounanue/proyectos/test-project/data/myfile.csv" for reading: No such file or directory
 [SQL:  '\n            COPY trips FROM %(path)s ...

I noticed two things about this error:

  1. odo converts my relative path to an absolute path, using (I guess) my current working directory (maybe with os?).

  2. odo uses the server-side COPY command instead of psql's client-side \copy, which means the file must be on the server's disk.
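To illustrate the first point, here is a minimal sketch of how a relative path turns into an absolute one when it is resolved against the client's working directory. This only mimics what os.path.abspath does; whether odo resolves the path in exactly this way is a guess, and resolve_against_cwd is a hypothetical helper, not part of odo.

```python
import posixpath


def resolve_against_cwd(path, cwd):
    """Resolve `path` against `cwd`, the way os.path.abspath would on POSIX.

    Hypothetical helper for illustration only; not an odo function.
    """
    if posixpath.isabs(path):
        # Absolute paths are only normalized, never re-rooted.
        return posixpath.normpath(path)
    return posixpath.normpath(posixpath.join(cwd, path))


# The working directory is taken from the error message above.
client_cwd = "/home/nanounanue/proyectos/test-project"
print(resolve_against_cwd("data/myfile.csv", client_cwd))
```

The resulting path only means something on the client. A server-side COPY then looks for that same path on the database host, where it does not exist.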

So I copied the file to the server (to the directory /tmp/raw-data/) and tried again:

odo("/tmp/raw-data/myfile.csv", "postgresql://user:password@some_ip:54322::tbl"), and I got:

....
FileNotFoundError: [Errno 2] No such file or directory: '/raw-data/2013-08-Citi-Bike-trip-data.csv'

(Note that the raw-data directory doesn't exist on my local machine.)

This did not work either.

Finally, I created the directory /tmp/raw-data on my local machine and copied the file there. Then everything worked. So my guess is that odo is mixing up the paths: it resolves the path on the local machine even though the file is loaded from the remote machine's disk...

This smells like a bug...

tomrg commented 7 years ago

Is there a way to force it to use \copy or, even better, to make it an option?
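Outside of odo, the effect of \copy can be had by streaming the file from the client with COPY ... FROM STDIN. Below is a minimal sketch using psycopg2's cursor.copy_expert. It is not an odo option, copy_csv_from_client is a hypothetical helper, and it assumes the CSV has a header row and that the target table already exists.

```python
def copy_csv_from_client(conn, table, csv_path):
    """Stream a local CSV file to the server via COPY ... FROM STDIN.

    `conn` is assumed to be an open psycopg2 connection. Hypothetical
    helper for illustration only; not part of odo.
    """
    # The table name is interpolated directly, so pass trusted identifiers only.
    # HEADER true assumes the first CSV line holds column names.
    sql = "COPY {} FROM STDIN WITH (FORMAT csv, HEADER true)".format(table)
    with open(csv_path, "r") as f, conn.cursor() as cur:
        # The file is read on the client and sent over the connection,
        # so the server never needs to see the client's path.
        cur.copy_expert(sql, f)
    conn.commit()
    return sql


# Usage (requires a reachable database; values are placeholders):
# import psycopg2
# conn = psycopg2.connect("postgresql://user:password@some_ip:54322/dbname")
# copy_csv_from_client(conn, "tbl", "data/myfile.csv")
```

Because the data travels over the connection, this approach should also work when the client and server see different file systems, for example with a dockerized database.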

josh-gree commented 5 years ago

Any ideas on this?

I would like to use odo to copy a local CSV to a dockerized Postgres. I have a host directory mounted into the container, but its path inside the container differs from its path on the host. I get the same errors as the OP.

psychemedia commented 5 years ago

@josh-gree I have a similar issue: I am trying to run a script in a Jupyter notebook container, with data files local to that container, that loads data into a db container connected via docker-compose.

The workaround for now is to mount the data files into the db container.