pepkit / pepdbagent

Database for storing sample metadata
BSD 2-Clause "Simplified" License

Return value of `get_anno` function #15

Closed nleroy917 closed 1 year ago

nleroy917 commented 1 year ago

I am not sure of the best route to take, but it would be nice to split up some of the get_anno functionality. For example, when I run the following:

>>> db.get_anno(namespace='nfcore')
{
    'demo_rna_pep': {
        'id': 176,
        'namespace': 'nfcore',
        'anno_info': {
            'proj_description': None,
            'n_samples': 5
        }
    },
    'demo_rna_derived': {
        'id': 177,
        'namespace': 'nfcore',
        'anno_info': {
            'proj_description': None,
            'n_samples': 5
        }
    }
}

it would be nice to receive only the annotation info for the namespace itself, rather than a collection of annotations for every project in that namespace, if that makes sense. In other words, I am proposing that there be two functions:

  1. get_project_annotation("nfcore/demo_rna_pep")

This will return something like:

{
    'id': 176,
    'namespace': 'nfcore',
    'anno_info': {
        'proj_description': None,
        'n_samples': 5
    }
}

  2. get_namespace_annotation("demo")

This will return something like:

{
  'namespace': 'demo',
  'n_projects': 12,
  'n_samples': 876
}

Obviously, some of these totals might need to be computed and stored when loading the PEPs into the database. Otherwise, if you are clever with SQL, you might be able to construct a query that computes them on the fly.
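
To illustrate the SQL route, here is a minimal sketch of an aggregate query. It assumes a hypothetical `projects` table with `namespace` and `n_samples` columns and uses an in-memory SQLite database purely for demonstration; the real pepdbagent schema and database driver will differ.

```python
import sqlite3


def get_namespace_annotation(conn, namespace):
    """Summarize one namespace with a single aggregate query.

    The `projects` table and its columns are hypothetical, not the
    actual pepdbagent schema.
    """
    row = conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(n_samples), 0) "
        "FROM projects WHERE namespace = ?",
        (namespace,),
    ).fetchone()
    return {"namespace": namespace, "n_projects": row[0], "n_samples": row[1]}


# Demo data mirroring the two nfcore projects above, plus one made-up project.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE projects ("
    "id INTEGER PRIMARY KEY, namespace TEXT, name TEXT, n_samples INTEGER)"
)
conn.executemany(
    "INSERT INTO projects VALUES (?, ?, ?, ?)",
    [
        (176, "nfcore", "demo_rna_pep", 5),
        (177, "nfcore", "demo_rna_derived", 5),
        (178, "demo", "basic", 3),  # made-up row for illustration
    ],
)

nfcore_summary = get_namespace_annotation(conn, "nfcore")
print(nfcore_summary)
```

With this approach nothing extra has to be stored at load time, as long as each project row already records its own `n_samples`.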

khoroshevskyi commented 1 year ago

It is possible to retrieve just n_samples for each project and then sum them, so I think that is a workable solution. Do you need any other information in the annotations?
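
A rough sketch of that summing approach, assuming the per-project dictionary has the same shape as the get_anno output shown earlier in this issue. The helper name `summarize_namespace` is hypothetical and not part of pepdbagent.

```python
def summarize_namespace(namespace, annotations):
    """Build a namespace-level summary from per-project annotations.

    `annotations` is assumed to look like the get_anno(namespace=...) output
    quoted in this issue: {project_name: {"anno_info": {"n_samples": ...}}}.
    """
    n_samples = sum(
        # Treat a missing or None n_samples as zero.
        (project.get("anno_info", {}).get("n_samples") or 0)
        for project in annotations.values()
    )
    return {
        "namespace": namespace,
        "n_projects": len(annotations),
        "n_samples": n_samples,
    }


# The per-project annotations quoted at the top of this issue.
nfcore_annotations = {
    "demo_rna_pep": {
        "id": 176,
        "namespace": "nfcore",
        "anno_info": {"proj_description": None, "n_samples": 5},
    },
    "demo_rna_derived": {
        "id": 177,
        "namespace": "nfcore",
        "anno_info": {"proj_description": None, "n_samples": 5},
    },
}

summary = summarize_namespace("nfcore", nfcore_annotations)
print(summary)
```

This keeps the aggregation on the client side, so it costs one pass over every project in the namespace on each call.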