cga-harvard / Data_Science_Big_Data_Projects

Repository for FASRC projects
MIT License

Error: pg_restore draws error due to non-existent 'dkakkar' role #4

Open jakerbrown opened 4 years ago

jakerbrown commented 4 years ago

[jbrown613@holy7c12307 partisan]$ time pg_restore -j 8 -h localhost -p 9509 -d partisandb /n/holyscratch01/enos_lab/jbrown613/partisan/usvotersdbcompressed.pgsql

pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 197; 1259 16385 TABLE partisan dkakkar
pg_restore: [archiver (db)] could not execute query: ERROR: role "dkakkar" does not exist
Command was: ALTER TABLE public.partisan OWNER TO dkakkar;

dkakkar commented 4 years ago

The default dump also included the table's ownership and grants/privileges, which is why the restore tries to run ALTER TABLE public.partisan OWNER TO dkakkar. I will create a new dump without them.

dkakkar commented 4 years ago

Could you try to restore /n/holyscratch01/enos_lab/dkakkar/partisan/usvotersdbnew.pgsql instead?

jakerbrown commented 4 years ago

The code that drew that message just finished running. You want me to run:

time pg_restore -j 8 -h localhost -p 9509 -d partisandb /n/holyscratch01/enos_lab/dkakkar/partisan/usvotersdbnew.pgsql

Correct?

dkakkar commented 4 years ago

Did the code run without error? Are you able to connect to DB?

jakerbrown commented 4 years ago

When I ran the above code, rather than what is in the instructions, I got the following error message:

[jbrown613@holy7c04204 ~]$ time pg_restore -j 8 -h localhost -p 9972 -d usvotersdb /n/holyscratch01/enos_lab/dkakkar/partisan/usvotersdbcompressed.pgsql
pg_restore: [archiver] could not open input file "/n/holyscratch01/enos_lab/dkakkar/partisan/usvotersdbcompressed.pgsql": No such file or directory

real 0m0.294s
user 0m0.001s
sys 0m0.005s

dkakkar commented 4 years ago

There is no such file in the directory. It was a file I created for my reference. Please create your own database dump by following the instructions here for each year you would like to run:
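Since a missing or incomplete dump only shows up once pg_restore starts, a small pre-flight check can save a job slot. The following is a minimal sketch, not part of the linked instructions; the function name `check_dump` is hypothetical, and it only tests that the path is a readable, non-empty regular file.

```shell
# Hypothetical pre-flight check: confirm the dump is a readable, non-empty
# regular file before starting a long restore job.
check_dump() {
    local dump="$1"
    if [ ! -f "$dump" ] || [ ! -r "$dump" ]; then
        echo "cannot read dump file: $dump" >&2
        return 1
    fi
    if [ ! -s "$dump" ]; then
        # An export that was cut short (for example by a job timeout)
        # can leave an empty file behind.
        echo "dump file is empty: $dump" >&2
        return 1
    fi
    echo "ok: $dump"
}
```

It could be chained in front of the restore, for example `check_dump "$dump" && time pg_restore ...`, so the restore only starts when the file is actually there.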

https://github.com/cga-harvard/GIS_Apps_on_HPC/wiki/Loading-and-Exporting-US-voters-database

Then follow the instructions here to run KNN on that year (statewise):

https://github.com/cga-harvard/GIS_Apps_on_HPC/wiki/KNN-calculation

Because there is a memory constraint on FASRC (256GB maximum GPU memory), you will have to run each year separately and state by state. Bigger states will have to be divided into several chunks. Please start with the smallest state and work up to the largest to learn how much data FASRC memory can hold. Then, for states above that limit, divide the data into chunks of that size and run the script on each chunk.
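The smallest-first, chunk-the-big-states plan can be sketched as a small planning step. This is only an illustration under assumptions: the helper name `plan_chunks`, the "STATE ROW_COUNT" input format, and the row limit are all hypothetical, and how a chunk (offset and size) is passed to the KNN script is not covered in this thread.

```shell
# plan_chunks MAX_ROWS
# Reads "STATE ROW_COUNT" lines on stdin, orders states from smallest to
# largest, and prints one "STATE OFFSET LIMIT" line per chunk so that no
# chunk holds more than MAX_ROWS rows.
plan_chunks() {
    local max_rows="$1"
    sort -k2,2n | awk -v max="$max_rows" '
        {
            # Walk through the state in steps of at most max rows.
            for (offset = 0; offset < $2; offset += max) {
                remaining = $2 - offset
                limit = (remaining < max) ? remaining : max
                print $1, offset, limit
            }
        }'
}
```

For example, `printf 'TX 250\nWY 40\nVT 100\n' | plan_chunks 100` (made-up counts) lists WY and VT as single chunks and TX as three chunks. The real limit would come from finding the largest state that still fits in memory.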

jakerbrown commented 4 years ago

Thank you, does this mean that there will be edge cases, where voters who live close to state boundaries, or to the boundaries of the chunks I create in bigger states, have a distorted partisan footprint, because any neighbors not in the chunk cannot be detected?

dkakkar commented 4 years ago

No, that's not how the script works. When you select a state, you are only choosing which voters to find neighbors for; the script still searches for those voters' neighbors in the whole US database, which avoids the edge-case problem. Additionally, it uses Geohash and R-tree search to speed up the neighbor search across the entire database. Please refer to my report for details: https://drive.google.com/file/d/12zAdVBoCTTRmDMQ3-nOQSNdjBAyZzh1r/view?usp=sharing

dkakkar commented 4 years ago

I have created a new data dump with the 2012 file. When I now go back and get to the step where I run:

time pg_restore -j 8 -h localhost -p 10979 -d usvotersdb /n/holyscratch01/enos_lab/jbrown613/partisan/usvotersdbcompressed.pgsql

I get a message that the role 'dkakkar' does not exist. I get the same message if I instead run:

time pg_restore -j 8 -h localhost -p 10979 -d usvotersdb /n/holyscratch01/enos_lab/dkakkar/partisan/usvotersdbfull.pgsql

What command did you use to export the dump file?

jakerbrown commented 4 years ago

My mistake. Looking back my FASRC job timed out before I was able to export. I will re-run all that and update.

Please use:

pg_dump -O -x -h localhost -p $port -Fc $databasename > /n/holyscratch01/enos_lab/$user/partisan/$databasename.pgsql
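One caveat worth checking against the pg_dump documentation for your PostgreSQL version: it describes `-O` as ignored when writing an archive format such as `-Fc`, and says the option can be given to pg_restore instead. If the role error persists with a fresh dump, a restore-side equivalent might look like the sketch below. The function name `restore_without_owner` and the `DRY_RUN` switch are hypothetical; `--no-owner` and `--no-privileges` are standard pg_restore options.

```shell
# Hypothetical wrapper: restore a custom-format dump while skipping
# ownership (ALTER ... OWNER TO) and privilege (GRANT/REVOKE) statements,
# so roles from the source database need not exist on the target.
# Arguments: port, database name, path to the dump file.
# Set DRY_RUN=1 to print the command instead of running it.
restore_without_owner() {
    local port="$1" db="$2" dump="$3"
    # ${DRY_RUN:+echo} expands to "echo" only when DRY_RUN is set.
    ${DRY_RUN:+echo} pg_restore --no-owner --no-privileges -j 8 \
        -h localhost -p "$port" -d "$db" "$dump"
}
```

With `--no-owner`, restored objects end up owned by the user running the restore, so the 'dkakkar' role is no longer needed on the target cluster.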