Open jeet-vora opened 1 year ago
Please tell me what exactly the columns should be. What I have for now is (placeholder from ncbi_linkouts):
$ head unreviewed/fruitfly_protein_flybase_linkouts.csv
ProviderId,Database,UID,URL,IconUrl,UrlName,SubjectType,Attribute
10227,Protein,FBgn0263772,https://glygen.org/protein/M9PJ12,,,,
10227,Protein,FBgn0000635,https://glygen.org/protein/P34082,,,,
10227,Protein,FBgn0011638,https://glygen.org/protein/P40796,,,,
10227,Protein,FBgn0013726,https://glygen.org/protein/P40797,,,,
10227,Protein,FBgn0004389,https://glygen.org/protein/P40794,,,,
10227,Protein,FBgn0000719,https://glygen.org/protein/P40795,,,,
10227,Protein,FBgn0010333,https://glygen.org/protein/P40792,,,,
10227,Protein,FBgn0010341,https://glygen.org/protein/P40793,,,,
10227,Protein,FBgn0011656,https://glygen.org/protein/P40791,,,,
@rykahsay
The output file should not be processed like the above. Please see the instructions below to create the output file.
protein_glygen_flybase_linkout.tsv (format is tsv)
fruitfly_protein_xref_flybase.csv
Please ensure the headers and case of the headers are in the same format as shown below. There are four headers.
FlyBase ID | DBNAME | DBID | DBURL |
---|---|---|---|
FBgn0032219 | GlyGen | Q9VKZ5-1 | Q9VKZ5-1 |
FBgn0053303 | GlyGen | Q76NQ0-1 | Q76NQ0-1 |
FBgn0041723 | GlyGen | Q76NQ1-1 | Q76NQ1-1 |
- Map xref_id into the FlyBase ID field of the output file.
- Add GlyGen to DBNAME for all entries.
- Map the uniprot_canonical_ac field to DBID and DBURL.
Output file name: protein_glygen_flybase_linkout.tsv
The base URL information for protein details page will be relayed in a different file.
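The mapping described above can be sketched as follows. This is only a minimal illustration, assuming the input file fruitfly_protein_xref_flybase.csv has columns named xref_id and uniprot_canonical_ac as referenced in the instructions. The function names build_linkout_rows and write_linkout are hypothetical and are not part of the existing pipeline.

```python
import csv

# Header names and case as requested in the ticket.
HEADERS = ["FlyBase ID", "DBNAME", "DBID", "DBURL"]


def build_linkout_rows(xref_rows):
    """Map xref rows (dicts) to four-column linkout rows.

    Assumes each row has 'xref_id' and 'uniprot_canonical_ac' keys.
    """
    rows = []
    for row in xref_rows:
        ac = row["uniprot_canonical_ac"]
        # DBNAME is the constant "GlyGen"; DBID and DBURL both carry the accession.
        rows.append([row["xref_id"], "GlyGen", ac, ac])
    return rows


def write_linkout(in_csv, out_tsv):
    """Read the xref CSV and write the tab-delimited linkout file."""
    with open(in_csv, newline="") as src, open(out_tsv, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
        writer.writerow(HEADERS)
        writer.writerows(build_linkout_rows(reader))
```

A call such as write_linkout("fruitfly_protein_xref_flybase.csv", "protein_glygen_flybase_linkout.tsv") would then produce the output file, provided the input columns match the assumption above.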
@katewarner
Add this protein_glygen_flybase_linkout.tsv into the masterlist as a TSV. It will not be used for API. Also create a BCO by adding relevant info from the ticket to the usability domain.
I have also uploaded the glygen_dbinfo.tsv into SP that will be shared with FlyBase after Robel creates the output file.
Check now
$ head unreviewed/protein_glygen_flybase_linkout.tsv
FlyBase ID DBNAME DBID DBURL
FBgn0263772 GlyGen M9PJ12-1 M9PJ12-1
FBgn0000635 GlyGen P34082-1 P34082-1
FBgn0011638 GlyGen P40796-1 P40796-1
FBgn0013726 GlyGen P40797-1 P40797-1
FBgn0004389 GlyGen P40794-1 P40794-1
FBgn0000719 GlyGen P40795-1 P40795-1
FBgn0010333 GlyGen P40792-1 P40792-1
FBgn0010341 GlyGen P40793-1 P40793-1
FBgn0011656 GlyGen P40791-1 P40791-1
@rykahsay @jeet-vora
I've added the dataset into the masterlist file. I also checked the generated dataset, and the FlyBase IDs appear to be mapped to the correct GlyGen-UniProt IDs.
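A structural check of this kind could be scripted roughly as below. This is a sketch only: it checks the header, the column count, and the form of the IDs, not whether each FlyBase ID is paired with the biologically correct accession. The FBgn pattern is an assumption based on the sample rows in this ticket, and check_linkout_lines is a hypothetical helper name.

```python
import csv
import re

EXPECTED_HEADER = ["FlyBase ID", "DBNAME", "DBID", "DBURL"]
# Assumption: gene IDs look like "FBgn" followed by digits, as in the sample rows.
FBGN_PATTERN = re.compile(r"^FBgn[0-9]+$")


def check_linkout_lines(lines):
    """Return a list of problems found in tab-delimited linkout lines."""
    problems = []
    reader = csv.reader(lines, delimiter="\t")
    header = next(reader, None)
    if header != EXPECTED_HEADER:
        problems.append(f"unexpected header: {header}")
    # Data rows start on line 2 of the file.
    for line_no, row in enumerate(reader, start=2):
        if len(row) != 4:
            problems.append(f"line {line_no}: expected 4 columns, got {len(row)}")
            continue
        flybase_id, dbname, dbid, dburl = row
        if not FBGN_PATTERN.match(flybase_id):
            problems.append(f"line {line_no}: bad FlyBase ID {flybase_id!r}")
        if dbname != "GlyGen":
            problems.append(f"line {line_no}: DBNAME is {dbname!r}")
        if dbid != dburl:
            problems.append(f"line {line_no}: DBID and DBURL differ")
    return problems
```

Passing an open file handle for unreviewed/protein_glygen_flybase_linkout.tsv as lines should return an empty list if the file follows the requested layout.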
https://wiki.flybase.org/wiki/FlyBase:Links_to_and_from_FlyBase#Links_from_FlyBase
Links from FlyBase
FlyBase supports linkouts from any FlyBase object that has a stable FlyBase ID (e.g. FBxx[0-9]+) and a web report. Databases suitable for this kind of linking to FlyBase are those with mature data structures whose data are expressed in terms of FlyBase genetic objects that carry stable identifiers or as sequences that can be mapped to the reference sequence of a Drosophila species. FlyBase currently accepts linkout data in a simple spreadsheet table (see below), plus a summary record for the external database with link information and name. We are happy to consider additional linkout databases. Please contact us if you would like to contribute links to your database.
FlyBase-curated links and linkouts are displayed on the Report Pages in the most appropriate section of the Report. Linkouts are indicated by a Linkout label in parentheses after the field label. In addition, on the Gene Report, all FlyBase-curated links and linkouts are also grouped together in a single External crossreferences & Linkouts section.
How to establish linkouts
Please note that if you are establishing a single type of linkout between FlyBase and your site then only a single linking table and database information file is required. If you want to establish multiple types of linkouts then you need to submit a linking table and database information file for each type.
Linkout requirements
Linkout Submission Format
Link table
The link table format is a simple 4-column tab-delimited file. The description of the columns in order is shown below. The filename of this file must use the form