biocommons / seqrepo-rest-service

OpenAPI-based REST interface to biological sequences and sequence metadata
Apache License 2.0

GA4GH Aliases Bug #5

Closed korikuzma closed 3 years ago

korikuzma commented 3 years ago

When fetching aliases, the ga4gh computed identifier has a GS prefix, which is not listed in the documentation.

This bug can be seen from the readme example:

  $ curl -f "http://0.0.0.0:5000/seqrepo/1/metadata/GRCh38:1"
  {
    "added": "2016-08-27T21:17:00Z",
    "aliases": [
      "GRCh38:1",
      "GRCh38:chr1",
      "GRCh38.p1:1",
      ...
      "GRCh38.p9:chr1",
      "MD5:6aef897c3d6ff0c78aff06ac189178dd",
      "NCBI:NC_000001.11",
      "refseq:NC_000001.11",
      "SEGUID:FCUd6VJ6uikS/VWLbhGdVmj2rOA",
      "SHA1:14251de9527aba2912fd558b6e119d5668f6ace0",
      "VMC:GS_Ya6Rs7DHhDeg7YaOSg1EoNi3U_nQ9SvO",
      "sha512t24u:Ya6Rs7DHhDeg7YaOSg1EoNi3U_nQ9SvO",
      "ga4gh:GS.Ya6Rs7DHhDeg7YaOSg1EoNi3U_nQ9SvO"
    ],
    "alphabet": "ACGMNRT",
    "length": 248956422
  }

I might be wrong, but I believe that the prefix should instead be SQ?
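
A minimal sketch for spotting this in a metadata response, assuming SQ is the expected prefix for ga4gh sequence identifiers. The helper names and the abbreviated sample are illustrative only and are not part of seqrepo-rest-service:

```python
import json


def ga4gh_aliases(metadata):
    """Return the aliases from a metadata response that are in the ga4gh namespace."""
    return [a for a in metadata.get("aliases", []) if a.startswith("ga4gh:")]


def unexpected_ga4gh_aliases(metadata, expected_prefix="SQ."):
    """Return ga4gh aliases whose identifier does not start with expected_prefix.

    Assumption: "SQ." is the correct type prefix for sequence digests.
    """
    return [
        a for a in ga4gh_aliases(metadata)
        if not a.split(":", 1)[1].startswith(expected_prefix)
    ]


# Abbreviated copy of the response shown above (most aliases omitted).
sample = json.loads("""
{
  "aliases": [
    "GRCh38:1",
    "sha512t24u:Ya6Rs7DHhDeg7YaOSg1EoNi3U_nQ9SvO",
    "ga4gh:GS.Ya6Rs7DHhDeg7YaOSg1EoNi3U_nQ9SvO"
  ],
  "alphabet": "ACGMNRT",
  "length": 248956422
}
""")

print(unexpected_ga4gh_aliases(sample))
# prints ['ga4gh:GS.Ya6Rs7DHhDeg7YaOSg1EoNi3U_nQ9SvO']
```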

reece commented 3 years ago

Hi @korikuzma. Thanks for filing the bug.

What versions of seqrepo-rest-service and seqrepo are you using?

The current version (last update Nov 2020) does not exhibit this issue. I get:

{
  "added": "2016-08-27T21:17:00Z",
  "aliases": [
    "GRCh38:1",
    "GRCh38:chr1",
⋮
    "sha512t24u:Ya6Rs7DHhDeg7YaOSg1EoNi3U_nQ9SvO",
    "ga4gh:SQ.Ya6Rs7DHhDeg7YaOSg1EoNi3U_nQ9SvO"
  ],
  "alphabet": "ACGMNRT",
  "length": 248956422
}

I have no doubt that what you see is real, and it might be that I just need to cut a new release.

For background: Early versions of seqrepo used ad hoc namespace names. I then discovered identifiers.org and, liking standards, switched to that. I tried to do this gracefully with a series of moves in both seqrepo and s-r-s that provided backward compatibility, but that only really worked when both were current. In particular, I think it's possible that using a new s-r-s, which no longer translates aliases on-the-fly, with an older seqrepo, which provides the old aliases, leads to what you see.
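
To illustrate what on-the-fly translation of aliases means here, a rough sketch follows. It is not the actual seqrepo or s-r-s code; the mapping table and function names are hypothetical, and the single GS-to-SQ entry is taken only from the two outputs quoted in this thread:

```python
# Hypothetical table of legacy alias prefixes and their current equivalents.
# Only the ga4gh GS -> SQ change is evidenced in this thread.
LEGACY_PREFIXES = {
    "ga4gh:GS.": "ga4gh:SQ.",
}


def translate_alias(alias):
    """Rewrite a legacy alias to its current form; return other aliases unchanged."""
    for old, new in LEGACY_PREFIXES.items():
        if alias.startswith(old):
            return new + alias[len(old):]
    return alias


def translate_aliases(aliases):
    """Translate every alias, dropping duplicates while preserving order."""
    seen = set()
    result = []
    for alias in aliases:
        translated = translate_alias(alias)
        if translated not in seen:
            seen.add(translated)
            result.append(translated)
    return result


old_style = [
    "sha512t24u:Ya6Rs7DHhDeg7YaOSg1EoNi3U_nQ9SvO",
    "ga4gh:GS.Ya6Rs7DHhDeg7YaOSg1EoNi3U_nQ9SvO",
]
print(translate_aliases(old_style))
```

If the service layer stops doing this kind of rewrite while the underlying database still stores the old form, the old prefix would pass straight through to the client.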

reece commented 3 years ago

@korikuzma : Please use the /ping endpoint and upload the results. For example:

snafu$ curl -X GET "http://0.0.0.0:5000/seqrepo/1/ping" -H "accept: application/json"
{
  "dependencies": {
    "bioutils": {
      "url": "https://github.com/biocommons/bioutils/",
      "version": "0.5.2.post3"
    },
    "seqrepo": {
      "root": "/usr/local/share/seqrepo/2019-06-20",
      "url": "https://github.com/biocommons/biocommons.seqrepo/",
      "version": "0.6.3"
    }
  },
  "url": "https://github.com/biocommons/seqrepo-rest-service/",
  "version": "0.1.3"
}

korikuzma commented 3 years ago

Hi, sorry for the delay. This is my result:

{
  "dependencies": {
    "bioutils": {
      "url": "https://github.com/biocommons/bioutils/",
      "version": "0.5.2.post3"
    },
    "seqrepo": {
      "root": "/usr/local/share/seqrepo/latest",
      "url": "https://github.com/biocommons/biocommons.seqrepo/",
      "version": "0.6.1"
    }
  },
  "url": "https://github.com/biocommons/seqrepo-rest-service/",
  "version": "0.1.3.dev0+g4c07db7.d20200713"
}

The seqrepo date that I have been using is 2020-11-27.

reece commented 3 years ago

Please clarify: Are you commenting on the example in the README, or reporting an actual bug in code that you have run recently?

The readme is old. Early versions of the VRS spec did use GS, so it appears there, but that's long gone. (I'll fix it right now.)

korikuzma commented 3 years ago

Sorry for the confusion. I am getting the same output as the example in the readme, which includes "ga4gh:GS.Ya6Rs7DHhDeg7YaOSg1EoNi3U_nQ9SvO" in the list of aliases.

reece commented 3 years ago

Hi @korikuzma : Indeed, this is a version issue.

See https://github.com/biocommons/biocommons.seqrepo/blob/main/docs/changelog/0.6/0.6.2.rst. It has a link to the commit where I fixed exactly this bug.

Please update seqrepo and it should work as we expect.
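
For anyone checking their own setup, a small sketch that reads a /ping response and flags an older seqrepo. It assumes the fix shipped in seqrepo 0.6.2, as the changelog link suggests; the function names are illustrative, and the version parsing is deliberately simplistic (it keeps only the leading numeric components):

```python
import re

# Assumption: the GS -> SQ fix is in seqrepo 0.6.2, per the changelog link above.
FIXED_IN = (0, 6, 2)


def parse_version(version):
    """Parse the leading dotted numbers of a version string into a tuple of ints.

    Suffixes such as ".post3" or ".dev0+g4c07db7" are ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def seqrepo_needs_update(ping, fixed_in=FIXED_IN):
    """Return True if the seqrepo version reported by /ping predates fixed_in."""
    installed = parse_version(ping["dependencies"]["seqrepo"]["version"])
    return installed < fixed_in


# Trimmed copy of the /ping result reported earlier in this thread.
reported = {
    "dependencies": {
        "seqrepo": {
            "root": "/usr/local/share/seqrepo/latest",
            "version": "0.6.1",
        }
    },
    "version": "0.1.3.dev0+g4c07db7.d20200713",
}

print(seqrepo_needs_update(reported))  # prints True
```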

korikuzma commented 3 years ago

Ah! Thanks for the help.