pepkit / pepdbagent

Database for storing sample metadata

`KeyError: '_sample_df'` when attempting to upload a project #31

Closed: nleroy917 closed this issue 1 year ago

nleroy917 commented 1 year ago

I was attempting to load a new database with new PEPs that conform to the latest schema (tags), and I received the following error:

line 799, in _create_digest
    json.dumps(project_dict["_sample_df"], sort_keys=True).encode("utf-8")
KeyError: '_sample_df'

Running isolated in the REPL:

>>> import peppy
>>> from pepdbagent import Connection
>>> pepagent = Connection(
...     user="postgres",
...     password="docker",
...     database="pep-base-sql"
... )
>>> proj = peppy.Project("examples/demo/basic/project_config.yaml")
>>> pepagent.upload_project(proj, namespace="demo", name="basic")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".venv/lib/python3.9/site-packages/pepdbagent/pepdbagent.py", line 101, in upload_project
    proj_digest = self._create_digest(proj_dict)
  File ".venv/lib/python3.9/site-packages/pepdbagent/pepdbagent.py", line 799, in _create_digest
    json.dumps(project_dict["_sample_df"], sort_keys=True).encode("utf-8")
KeyError: '_sample_df'
>>> 

nleroy917 commented 1 year ago

I think the root cause is in peppy, right here: https://github.com/pepkit/peppy/blob/c7ecbd3e0d5ceb0c3561ff290cac8c58c634d478/peppy/project.py#L300

In p_dict, _sample_df is never set. The closest thing is _sample_dict.

Locally, changing pepdbagent.Connection._create_digest to look for _sample_dict instead gets past this error, but it then fails on the _subsample_df key:

Traceback (most recent call last):
  File ".venv/lib/python3.9/site-packages/attmap/ordattmap.py", line 46, in __getitem__
    return super(OrdAttMap, self).__getitem__(item)
KeyError: '_subsample_df'
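
A quick way to confirm which keys are actually missing before the lookup fails is to compare the expected keys against the project dict. This is only a sketch: `p_dict` below is a hand-written stand-in for what peppy produces (its contents, including the `_subsample_dict` entry, are assumed for illustration), and `missing_keys` is a hypothetical helper, not part of peppy or pepdbagent.

```python
# Stand-in for the dict produced by peppy; the keys mirror those named in
# this thread, but the values are invented for illustration.
p_dict = {
    "name": "basic",
    "_sample_dict": [{"sample_name": "s1"}],
    "_subsample_dict": None,
}


def missing_keys(project_dict, expected):
    """Return the expected keys that are absent from project_dict."""
    return [key for key in expected if key not in project_dict]


# The two keys pepdbagent looks up are both absent from the stand-in dict.
print(missing_keys(p_dict, ["_sample_df", "_subsample_df"]))
# prints ['_sample_df', '_subsample_df']
```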

nleroy917 commented 1 year ago

Meeting notes from 08/29/2022

We need to change the following:

  1. _sample_df needs to be _sample_dict
  2. _subsample_df needs to be _subsample_dict

The proposed solution: couple the keys/attributes by importing them from peppy, so that pepdbagent no longer hardcodes them.
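
A minimal sketch of what that coupling could look like. To keep it self-contained, the key names are defined locally as hypothetical constants (`SAMPLE_DICT_KEY`, `SUBSAMPLE_DICT_KEY`); in the actual fix they would be imported from peppy, and the exact constant names there are not confirmed in this thread. The use of MD5 is also an assumption, since the traceback only shows the `json.dumps(...).encode("utf-8")` step.

```python
import hashlib
import json

# Hypothetical stand-ins: in the real fix these would be imported from peppy
# instead of being redefined here, so both packages agree on the key names.
SAMPLE_DICT_KEY = "_sample_dict"
SUBSAMPLE_DICT_KEY = "_subsample_dict"


def create_digest(project_dict):
    """Hash the sample table of a project dict (MD5 is assumed here)."""
    # sort_keys=True makes the digest independent of key order.
    sample_bytes = json.dumps(
        project_dict[SAMPLE_DICT_KEY], sort_keys=True
    ).encode("utf-8")
    return hashlib.md5(sample_bytes).hexdigest()


digest = create_digest({SAMPLE_DICT_KEY: [{"sample_name": "s1", "file": "a.txt"}]})
```

With the keys referenced through shared constants, a rename on the peppy side shows up as an import error rather than a KeyError at upload time.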