geneontology / amigo

AmiGO is the public interface for the Gene Ontology.
http://amigo.geneontology.org
BSD 3-Clause "New" or "Revised" License

web visibility of GO xrefs list #607

Open zhilianghu opened 3 years ago

zhilianghu commented 3 years ago

When I searched for biology-related xref lists, the first 2 pages of Google results showed the UniProt and NCBI db_xrefs lists. The GO Consortium dbxref list is hard to find, although it has 266 databases while the other 2 resources have only 183 and 129 databases, respectively. The GO Consortium dbxref list needs better visibility. Would better META descriptions on the public web page (view-source:http://amigo.geneontology.org/xrefs) help, or is there some other means that could help?

pgaudet commented 3 years ago

@lpalbou Any suggestion on this ?

Thanks, Pascale

pgaudet commented 3 years ago

@cmungall

zhilianghu commented 3 years ago

Currently the page META "description" is "AmiGO 2". I suggest using something like: "Definitions, syntax, and list of cross-references in genomics, genetics, and life sciences databases; db_xref; dbxref; xref; database;" The page title, "Information about Cross References", could also be altered to better reflect the nature of the content, for example: "A Guide to Database Cross-references in Life Sciences". Is the page TITLE no longer picked up by Google? If it still is, it could also be updated.
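To make the suggestion concrete, here is a minimal sketch of how the proposed title and description could be rendered into head tags. The helper names `escapeHtml` and `buildHeadTags` are hypothetical and not part of AmiGO; how AmiGO actually generates its page head is not described in this thread.

```javascript
// Hypothetical helpers for illustration only; AmiGO's real templating is not shown here.
function escapeHtml(text) {
  // Escape characters that would break out of an HTML attribute or element.
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function buildHeadTags({ title, description }) {
  // Returns the <title> and META description tags as one string.
  return [
    `<title>${escapeHtml(title)}</title>`,
    `<meta name="description" content="${escapeHtml(description)}">`,
  ].join("\n");
}

const headTags = buildHeadTags({
  title: "A Guide to Database Cross-references in Life Sciences",
  description:
    "Definitions, syntax, and list of cross-references in genomics, " +
    "genetics, and life sciences databases; db_xref; dbxref; xref; database;",
});
console.log(headTags);
```

Whether a search engine actually uses these tags for ranking or snippets is up to the search engine; the sketch only shows where the proposed text would go.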

cmungall commented 3 years ago

Hi @zhilianghu good to hear from you!

Note that this has been low on our priorities, as our xref metadata has primarily been for use within the GOC. However, I think it's a generally useful piece of metadata and we may want to publicise it more, e.g. via SEO/tags, putting a page on our main website, or even a publication.

But I actually think the approach here should be to make the GO xref metadata a spoke in a larger, more distributed model. Some other things of relevance are happening that are broader than GO

A lot of this is out of scope for our GO registry, so I suggest opening a ticket on the prefixcommons repo to continue there.

lpalbou commented 3 years ago

@zhilianghu it seems you are referring to two different things:

The second is a much larger discussion and if that's what you would like, it would need some clarification / examples. So I will only answer for the first part "how to find".

1) The latest go dbxrefs will always be accessible with:

2) Currently, the only mention of dbxrefs on the GO website is: http://geneontology.org/docs/GO-term-elements#database-cross-references and it only refers to xrefs for GO terms. I will add some text there as well as a link to the download-mappings section.

3) We also have a page about cross-references and mappings: http://geneontology.org/docs/download-mappings/ which should probably be linked from the GO Downloads menu (todo). Unfortunately, it doesn't describe the dbxrefs, but that's where the description should be. When this is done, the page will be indexed by Google and other search engines.

Now, if your usage is bioinformatic in nature, I also recently published a lightweight dbxrefs handler on NPM: https://www.npmjs.com/package/@geneontology/dbxrefs

I still have to write the docs, but essentially:

import * as dbxrefs from "@geneontology/dbxrefs";

// This will return a promise while the latest dbxrefs is retrieved from the URL above
dbxrefs.init(() => {
    this.dbXrefsReady = true;
});

// When ready, you can simply do the following to get the full URL:
let url = dbxrefs.getURL(databaseName, entityType, entityID);

entityType can be undefined, in which case getURL will take the first record for the database databaseName in the dbxrefs.yaml.
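To illustrate the lookup described above, here is a self-contained sketch that does not use the @geneontology/dbxrefs package. Everything in it is an assumption made for illustration: the in-memory `registry`, the field names (`database`, `entity_types`, `type_name`, `url_syntax`), the `[example_id]` placeholder, and the reading of "first record" as the first entity type listed for the database. The package's real internals may differ.

```javascript
// Hypothetical stand-in for a parsed dbxrefs.yaml; names and URLs are made up.
const registry = [
  {
    database: "ExampleDB",
    entity_types: [
      { type_name: "gene", url_syntax: "https://example.org/gene/[example_id]" },
      { type_name: "protein", url_syntax: "https://example.org/protein/[example_id]" },
    ],
  },
];

// Sketch of a getURL-like lookup (not the package's implementation).
function getURL(records, databaseName, entityType, entityID) {
  // Match the database name case-insensitively.
  const record = records.find(
    (r) => r.database.toLowerCase() === databaseName.toLowerCase()
  );
  if (!record) return undefined;

  // With no entity type given, fall back to the first entry for the database.
  const entry =
    entityType === undefined
      ? record.entity_types[0]
      : record.entity_types.find((t) => t.type_name === entityType);
  if (!entry || !entry.url_syntax) return undefined;

  // A function replacer avoids special handling of "$" in the ID.
  return entry.url_syntax.replace("[example_id]", () => entityID);
}

console.log(getURL(registry, "ExampleDB", undefined, "X1"));
console.log(getURL(registry, "ExampleDB", "protein", "P1"));
```

Returning `undefined` for an unknown database or entity type is a design choice of this sketch; a caller could then decide whether to show the bare identifier instead of a link.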

If your usage is more about CURIE <-> IRI conversion, we also provide another library, both in javascript and java:

Please note, however, that IRIs are different from URLs: IRIs do not necessarily resolve to a web page. They are meant to be permanent identifiers over the web to simplify the interconnection of knowledge over multiple data providers (aka semantic web). All GO terms and GO-CAMs have IRIs (see https://geneontology.github.io/docs/sparql/#resource-description-framework-rdf).
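As a rough illustration of what CURIE <-> IRI conversion involves, here is a minimal sketch based on a prefix map. It is not the library mentioned above; the function names `curieToIri` and `iriToCurie` and the one-entry `prefixMap` are hypothetical, and the only entry used is the OBO PURL base for GO terms.

```javascript
// Hypothetical prefix map for this sketch; a real one would hold many prefixes.
const prefixMap = {
  GO: "http://purl.obolibrary.org/obo/GO_",
};

// Expand a CURIE such as "GO:0008150" into a full IRI.
function curieToIri(curie, prefixes) {
  const idx = curie.indexOf(":");
  if (idx < 0) return undefined;
  const prefix = curie.slice(0, idx);
  if (!Object.prototype.hasOwnProperty.call(prefixes, prefix)) return undefined;
  return prefixes[prefix] + curie.slice(idx + 1);
}

// Contract an IRI back into a CURIE, preferring the longest matching base
// so that nested namespaces resolve to the most specific prefix.
function iriToCurie(iri, prefixes) {
  let best;
  for (const [prefix, base] of Object.entries(prefixes)) {
    if (iri.startsWith(base) && (!best || base.length > best.base.length)) {
      best = { prefix, base };
    }
  }
  return best ? `${best.prefix}:${iri.slice(best.base.length)}` : undefined;
}

const iri = curieToIri("GO:0008150", prefixMap);
console.log(iri);
console.log(iriToCurie(iri, prefixMap));
```

Note that the conversion is purely string-based: producing an IRI says nothing about whether that IRI resolves to a web page.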

lpalbou commented 3 years ago

Here is a proposal: https://github.com/geneontology/geneontology.github.io/pull/278

And a new menu item: [screenshot: Screen Shot 2021-02-04 at 2 46 59 PM]

zhilianghu commented 3 years ago

My suggestion was very simple: how to get search engines to pick up this list for relevant keyword searches on the internet.

The META description was only my suggested way to feed web crawlers.

Sorry if this has confused you all.

Zhiliang


lpalbou commented 3 years ago

Got it, thanks for the feedback Zhiliang.

When the PR is approved, the GO dbxrefs.yaml file will be referenced on the GO website and from there, it should be indexed / searchable by Google. There will be more SEO work to do in the future, though.

zhilianghu commented 3 years ago

Hi Chris – my motivation was how to make the list more readily available to the larger life science community (in my experience it’s not straightforward), using META tags to boost it in search engines.

Actually, I have a little more. As I expressed in the recent Ag-BioData GFF3 format recommendation working group, among the 3 db_xref lists in the life sciences (one at NCBI, one at UniProt, and one at GO), your xref list is the best managed, with a schema and format validator, and has the highest number of databases listed. On the GFF3 format writing team I asked Sierra to pass a request to you: is it possible for you to unite the 3 lists into one, so that a GFF3 validator will have an easier job and database developers will have an ultimate resource to reference?

I understand you have other priorities, and this “free job” only serves the good of the larger community 😊

Zhiliang
