pepkit / pepdbagent

Database for storing sample metadata
BSD 2-Clause "Simplified" License

`PEPAgent` upload issues #7

Closed nleroy917 closed 1 year ago

nleroy917 commented 2 years ago

Just had a couple of issues uploading projects to a database. I thought I would distill it all here:

I wrote an upload script to load a folder of PEPs into a database. I just pulled the official postgres docker image and started running it like so:

docker run -p 5432:5432 -e POSTGRES_PASSWORD=admin postgres

Then I run the upload script with:

python scripts/load_db.py examples  

It seems to sort of work with two problems I've run into:

  1. The PEPAgent doesn't create the table if it doesn't already exist, so the upload fails with `relation "projects" does not exist LINE 1: INSERT INTO projects(project_name, project_value, description...`. I had to go into pgadmin and manually create the schema. It'd be cool if PEPAgent could recognize this and just create it for you with the correct data types, names, etc.

  2. The digests are all the same, and the id column is null. This might be an artifact from me creating the table manually with incorrect data types, but I wasn't sure.


I'm less experienced in db-administration/connection tools, so maybe this isn't a good idea. @nsheff ?
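
For illustration, the auto-create behavior suggested in point 1 might look like the minimal sketch below. The `ensure_projects_table` helper is hypothetical and not part of pepdbagent. Only the column names visible in the error message and the `id` definition are known; the other column types are assumptions, and the real schema may contain more columns.

```python
# Hypothetical DDL: only the `id` definition and the column names are known.
# The TEXT/JSONB types are assumptions, and the real schema may have more columns.
CREATE_PROJECTS_SQL = """
CREATE TABLE IF NOT EXISTS projects (
    id BIGSERIAL NOT NULL PRIMARY KEY,
    project_name TEXT NOT NULL,
    project_value JSONB NOT NULL,
    description TEXT
)
"""


def ensure_projects_table(connection):
    """Create the projects table if it is missing.

    Accepts any DB-API 2.0 connection (for example one from psycopg2).
    IF NOT EXISTS makes the call safe to repeat on every startup.
    """
    cursor = connection.cursor()
    try:
        cursor.execute(CREATE_PROJECTS_SQL)
        connection.commit()
    finally:
        cursor.close()
```

With the container started as above, this could be called once on a connection to `localhost:5432` before any inserts are attempted.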

khoroshevskyi commented 2 years ago

1. Why should it create the table if we are connecting to a database that should already have this table?
2. Hmm, I don't know why that happened. In the initial SQL file we create this table with the line `id BIGSERIAL NOT NULL PRIMARY KEY,`. Since it's a primary key, it should generate the key.

nleroy917 commented 2 years ago

> 1. Why should it create the table if we are connecting to a database that should already have this table?

If someone starts a fresh postgres instance (pulling a docker image, or spinning one up on AWS), then they wouldn't have to worry about table creation. All the user would have to worry about is deploying an instance, and pepdb would handle the schema. @nsheff can weigh in here on whether that's within the scope of this package.

> 2. Hmm, I don't know why that happened. In the initial SQL file we create this table with the line `id BIGSERIAL NOT NULL PRIMARY KEY,`. Since it's a primary key, it should generate the key.

I'll wait for this to be more polished, then come back to the digest + id issue.
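
As a point of comparison when revisiting the identical digests, a content-based digest should differ whenever two projects differ. The sketch below assumes a project can be represented as a plain dict. `project_digest` is a hypothetical helper, and the choice of SHA-256 over canonical JSON is an assumption, not pepdbagent's actual digest method.

```python
import hashlib
import json


def project_digest(project: dict) -> str:
    """Return a SHA-256 hex digest of a project's content (hypothetical helper)."""
    # sort_keys and fixed separators make the serialization deterministic,
    # so the same content always hashes to the same value.
    canonical = json.dumps(project, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

If every row ends up with the same digest, one thing to check is whether the value being hashed actually varies between projects.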

nleroy917 commented 2 years ago

Ok. I see now that the pep_db folder has the docker image + schema file. That makes sense... so we might not need to address point 1 from my original comment.

nsheff commented 1 year ago
  1. @Khoroshevskyi Make sure you at least add instructions for how to initialize the table in the top-level README.md

khoroshevskyi commented 1 year ago

I have added this information to the top-level README. I think this issue can be closed.