pepkit / pepdbagent

Database for storing sample metadata
BSD 2-Clause "Simplified" License

Namespacing/Registry Path Issues/Discussion #9

Closed nleroy917 closed 1 year ago

nleroy917 commented 1 year ago

There are two things/ideas I wanted to bring up: 1.) How we manage namespaced projects, and 2.) How we might manage any potential "global" or "official" projects.

Namespaced Projects

At least for the use of pephub, it would be nice to agree on a standard or consistent namespacing/registry scheme. Alex and I have discussed this a bit. My personal opinion is that the PepAgent API should have all getters/setters accept either a registry path (namespace/project_name) or the two separate variables needed to locate the project (namespace and project_name). I'm really open to discussing this and what would be best, however, since pephub might not be the only consumer of this API.

An example:

import peppy

pepdb = PepAgent(
    user="postgres",
    password="docker"
)

# GETTER
# option 1: a single registry path
proj = pepdb.get_project("demo/basic")

# option 2: namespace and project name passed separately
proj = pepdb.get_project(
    namespace="demo",
    project_name="basic"
)

# SETTER
proj = peppy.Project("/path/to/config")

# option 1: a single registry path
pepdb.upload_project(
    project=proj,
    registry="demo/basic"
)

# option 2: namespace and project name passed separately
pepdb.upload_project(
    project=proj,
    namespace="demo",
    project_name="basic",
)
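
Both calling conventions can be reduced to the same (namespace, project_name) pair before anything touches the database. Here is a minimal sketch of that round trip, assuming the plain namespace/project_name format shown above. The helper names split_registry_path and join_registry_path are hypothetical and not part of PepAgent:

```python
from typing import Tuple


def split_registry_path(registry: str) -> Tuple[str, str]:
    """Split 'namespace/project_name' into its two parts (hypothetical helper)."""
    namespace, sep, project_name = registry.partition("/")
    # Reject paths with no slash, an empty part, or more than one slash.
    if not sep or not namespace or not project_name or "/" in project_name:
        raise ValueError(f"Expected 'namespace/project_name', got {registry!r}")
    return namespace, project_name


def join_registry_path(namespace: str, project_name: str) -> str:
    """Build a registry path from its two parts (hypothetical helper)."""
    return f"{namespace}/{project_name}"


print(split_registry_path("demo/basic"))
print(join_registry_path("demo", "basic"))
```

With something like this in place, option 1 and option 2 could share one code path: the getter or setter converts whatever it receives into the pair and proceeds from there.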

Global Projects

Docker has a neat convention for denoting official images. They utilize an underscore (_) in the registry path to indicate that the particular image is an "official" image. More on that here. Alex and I discussed having "global" or "official" PEPs that the maintainer of a PEPHub instance might want to publish along with the other namespaced PEPs. These could be denoted with an underscore in the registry path.

For the API of PepAgent, if only a project_name is specified, we can assume it's an official PEP:

pepdb = PepAgent(
    user="postgres",
    password="docker"
)

pepatac = pepdb.get_project("pepatac")

# "under the hood"
# ...
if namespace is None:
    namespace = "_"
    return self.get_project(f"{namespace}/{project_name}")
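
The fallback above could also be written as a small standalone function. This is only a sketch: DEFAULT_NAMESPACE and qualify_project_name are hypothetical names, and the "_" value simply follows the proposal in this comment.

```python
from typing import Optional

# Assumption: "_" marks "official" PEPs, as proposed in this comment.
DEFAULT_NAMESPACE = "_"


def qualify_project_name(project_name: str, namespace: Optional[str] = None) -> str:
    """Return a full registry path, falling back to the default namespace."""
    if namespace is None:
        namespace = DEFAULT_NAMESPACE
    return f"{namespace}/{project_name}"


print(qualify_project_name("pepatac"))
print(qualify_project_name("basic", namespace="demo"))
```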

nsheff commented 1 year ago

  1. You should use ubiquerg.parse_registry_path for this. It lets you convert between the two forms easily, so you can accept either one (or both) and derive the other.
  2. I'm not sure we need the complexity of "official" PEPs -- we can just have a lab namespace that we use for our PEPs: databio/{project_name}. The "official" designation brings too many questions of authority, review, maintenance, etc. Use the GitHub model, where everything has a namespace.

nsheff commented 1 year ago

@Khoroshevskyi -- is this an issue that @rafalstepien could help tackle?

This issue is kind of a bottleneck right now for @nleroy917 transitioning pephub to using the database back-end.

khoroshevskyi commented 1 year ago

This is not a big issue, so I think I will have an update soon. I am working on it right now.

khoroshevskyi commented 1 year ago

I have changed the GET function. It now looks like this:

def get_project(
    self,
    registry: str = None,
    namespace: str = None,
    name: str = None,
    id: int = None,
) -> peppy.Project:
    """
    Retrieve a project from the database by registry path, by namespace and name, or by id.

    :param str registry: project registry path (namespace/name)
    :param str namespace: project namespace [should be used with name]
    :param str name: project name in database [should be used with namespace]
    :param int id: project id in database
    :return: peppy object with found project
    """

So now it is possible to get projects by registry path.
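
One way the three lookup modes might be kept mutually exclusive is to resolve the arguments into a single lookup key before querying. The sketch below is an assumption about how such validation could look, not the code in pepdbagent; resolve_lookup and its tuple return format are hypothetical.

```python
from typing import Optional, Tuple, Union

# Hypothetical key format: ("id", 3) or ("path", "demo", "basic").
LookupKey = Union[Tuple[str, int], Tuple[str, str, str]]


def resolve_lookup(
    registry: Optional[str] = None,
    namespace: Optional[str] = None,
    name: Optional[str] = None,
    id: Optional[int] = None,
) -> LookupKey:
    """Reduce the get_project arguments to one unambiguous lookup key."""
    by_registry = registry is not None
    by_pair = namespace is not None or name is not None
    by_id = id is not None
    # Exactly one lookup mode may be used per call.
    if sum([by_registry, by_pair, by_id]) != 1:
        raise ValueError("Provide exactly one of: registry, namespace + name, or id")
    if by_id:
        return ("id", id)
    if by_registry:
        namespace, sep, name = registry.partition("/")
        if not sep:
            raise ValueError(f"Expected 'namespace/name', got {registry!r}")
    # Covers both a half-specified pair and an empty part in the registry path.
    if not namespace or not name:
        raise ValueError("namespace and name must be used together")
    return ("path", namespace, name)


print(resolve_lookup(registry="demo/basic"))
print(resolve_lookup(namespace="demo", name="basic"))
print(resolve_lookup(id=3))
```

Resolving to one key up front means the database query only has to handle two cases (by id or by namespace and name), regardless of how the caller spelled the request.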