pombase / canto

The PomBase community curation tool
https://curation.pombase.org

UNIPROT TOOL: (important) export of data #535

Closed pombase-admin closed 9 years ago

pombase-admin commented 11 years ago

At the moment, Ruth has to go into each individual curation session in order to export data. (She doesn't actually use the "download data" option, as it is somewhat broken; instead she manually transfers the annotations to Protein2GO.) Even if she did use this option, it would be very time-consuming when there are a lot of sessions to export. Furthermore, the annotation extensions do not get exported this way (they are not shown in the table).

She would like something along the lines of:

  1. An option in the admin panel to export data from selected curation sessions, in a single flat-file format, that she can send on to Tony.
     1b. Annotation extensions must also be exported.
  2. Sessions that have already been exported should be 'movable' into a section for already exported sessions so that she knows which ones she has already sent to Tony (so that he doesn't have to filter for redundancy). Moving into two sections might not be the best way but she needs a way to distinguish sessions already dealt with from the new ones.
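A minimal sketch of request 1 and 1b, assuming a simple in-memory data model. The column layout, the field names and the `export_sessions` helper are all hypothetical; the thread does not specify the export format, and this is not Canto's actual code. It only illustrates writing annotations from several selected sessions, including their extensions, into one tab-separated file.

```python
import csv
import io

# Hypothetical column layout; the real export format is not defined in this thread.
COLUMNS = ["session", "gene", "term", "evidence", "extension", "submitted"]


def export_sessions(sessions, selected_keys, out):
    """Write annotations from the selected sessions to one tab-separated file.

    Returns the number of annotation rows written.
    """
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)
    rows = 0
    for session in sessions:
        if session["key"] not in selected_keys:
            continue
        for ann in session["annotations"]:
            writer.writerow([
                session["key"],
                ann["gene"],
                ann["term"],
                ann["evidence"],
                # Requirement 1b: extensions are exported too. Joining several
                # extensions with "|" is an assumption made for this sketch.
                "|".join(ann.get("extensions", [])),
                # Date the session was submitted for approval, if known.
                session.get("submitted", ""),
            ])
            rows += 1
    return rows


# Made-up example data, for illustration only.
EXAMPLE_SESSIONS = [
    {"key": "s1", "submitted": "2013-04-15", "annotations": [
        {"gene": "P12345", "term": "GO:0006281", "evidence": "IDA",
         "extensions": ["occurs_in(GO:0005634)", "part_of(GO:0006281)"]},
    ]},
    {"key": "s2", "submitted": "2013-04-15", "annotations": [
        {"gene": "P12345", "term": "GO:0005737", "evidence": "IDA"},
    ]},
    {"key": "s3", "submitted": "2013-04-16", "annotations": [
        {"gene": "P12345", "term": "GO:0005634", "evidence": "IDA"},
    ]},
]

buffer = io.StringIO()
exported = export_sessions(EXAMPLE_SESSIONS, {"s1", "s3"}, buffer)
print(buffer.getvalue(), end="")
```

Including the submission date as a column would also cover the date-stamping idea, since each exported row then records when its session was submitted.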

Maybe it would be useful to her if things were date-stamped as well (when things were submitted for approval), so that she will know from the dates what she might already have submitted to GO.

She believes this is why she hasn't bothered using the tool more in the last few months (she can't export the data neatly), so it sounds quite important.

Original comment by: Antonialock

pombase-admin commented 11 years ago

Ruth asked for this ages ago but I didn't get to it. It needs a little bit of thought so it will probably take a few days to implement.

Original comment by: kimrutherford

pombase-admin commented 11 years ago

I'll start work on this now.

I like your suggested workflow. How does this sound:

Once a session is exported it won't be editable, as that would cause confusion. If annotation needs to change later, the user will need to open a new session.

One problem with this plan is that it's hard to cope with deleting annotation. It doesn't matter for PomBase because we are or will be sending a fresh GAF file to GO central as needed. If Ruth is just sending GAF files to Tony for inclusion in GOA, she'll have to arrange for annotation to be deleted manually when that's needed. That may not be a big problem for her.

Original comment by: kimrutherford

pombase-admin commented 11 years ago

I think that sounds OK too. I'm sure the deletion will be OK, because Ruth will be able to do that through Protein2GO. Val

Original comment by: ValWood

pombase-admin commented 11 years ago

Also, for this one, if this is not completed BEFORE the workshop it doesn't matter so much, as long as it is available after to export the completed sessions, so if you still need flexibility you could do this last...

Original comment by: ValWood

pombase-admin commented 11 years ago

email correspondence:

Lovering, Ruth

Apr 15 (2 days ago)

to Nilsson, kim_rutherford

Dear Antonia and Kim

I hadn't really thought in detail about the current use of the tool and the possible future use of the tool.

For our current use (getting scientists to submit annotations, having them reviewed by a curator (me) and then sending the annotations to Tony), the key aspect is just to minimise the curator's (my) time and to train people to use a tool which they do not need to be given specific access to (which is the problem with getting the community to use Protein2GO). Once the CANTO annotations are imported (by Tony) into Protein2GO, the Protein2GO tool can be used by a curator to edit the protein record if needed.

Therefore I think your suggestions are excellent; a couple of comments inline, after ">".

  • On the front page, under "Tools" add an "Export curation" link that downloads the annotation
  • It will only download the data from APPROVED sessions
  • Once it's exported, mark the sessions as EXPORTED (instead of APPROVED)

> I am assuming that the tool will mark the sessions as EXPORTED (instead of APPROVED)

Once a session is exported it won't be editable, as that would cause confusion.

> As described above, we will make edits using Protein2GO so we won't need to edit these records.

So this is all perfect for the community annotation of proteins. Things are still up in the air wrt annotation of non-coding RNAs. This grant won't start until August. There is some talk of Protein2GO being adapted to take RNA central Ids, however if this does not happen we will be very interested in using the CANTO tool to create annotations for export to Rfam. For these annotations we will need a system more like the one that PomBase uses, as it is probably easier for everyone if we export one file with all our non-coding RNAs which contains any edits/deletions etc of the previous file. However, if it is OK with you I think we should wait until the end of the summer and then see how things stand.

Best and many thanks

Kim Rutherford

Apr 15 (2 days ago)

to Ruth, Nilsson

On Monday 15 April 2013 at 10:15:24, Lovering, Ruth wrote:

  • On the front page, under "Tools" add an "Export curation" link that downloads the annotation
  • It will only download the data from APPROVED sessions
  • Once it's exported, mark the sessions as EXPORTED (instead of APPROVED)

I am assuming that the tool will mark the sessions as EXPORTED (instead of APPROVED)

Yep, sorry, that wasn't clear. The tool will automatically mark the sessions as "EXPORTED" so they don't get exported twice.

Once a session is exported it won't be editable, as that would cause confusion.

As described above, we will make edits using Protein2GO so we won't need to edit these records. So this is all perfect for the community annotation of proteins.

Great. I'll get on with implementing that.

Original comment by: Antonialock

pombase-admin commented 11 years ago

Also, Ruth says that the source for her annotations in the exported GAF file should be "BHF-UCL". Should we rename the UniProt tool in that case? Or create one just for Ruth/UCL?

Original comment by: kimrutherford

pombase-admin commented 11 years ago

May as well rename it. "UniProt tool" is a bit of an ugly name anyway, seeing that UniProt doesn't really have anything to do with it - we just use their protein identifiers :)

Original comment by: Antonialock

pombase-admin commented 11 years ago

We don't officially call it the "UniProt" tool anymore. Here, at http://curation.pombase.org/, it is: "Generic Gene Ontology Implementation: Curate GO annotations for proteins, using UniProtKB identifiers."

And in the tool itself, at http://curation.pombase.org/uniprot, it is: "Generic GO Community Curation"

I think for now we should switch to "GOC" and then:

i) find out if we can have a "source ID" for community curation
ii) for situations like Ruth's, for now she/we can just do a substitution on the GAF (GOC -> BHF-UCL), or later Kim can fix it up so that when the export is done the exporter can select from a number of "allowed" sources

I don't think we should spawn another implementation just yet ....

(We will need another implementation for Ruth's RNA project, because that will have a different ID range, so that one can be BHF-UCL.)
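The interim source substitution on the GAF could be sketched roughly as follows. This assumes the GAF 2.x layout, in which column 15 is Assigned_By, and uses "BHF-UCL" as the source name Ruth asked for. The `replace_source` helper and the example line are made up for illustration; this is not part of Canto.

```python
ASSIGNED_BY = 14  # zero-based index of GAF column 15 (Assigned_By), assuming GAF 2.x


def replace_source(gaf_lines, old="GOC", new="BHF-UCL"):
    """Yield GAF lines with the Assigned_By value changed from `old` to `new`.

    Header/comment lines (starting with "!") are passed through untouched,
    and only the Assigned_By column is compared, so the same text appearing
    in any other column is left alone.
    """
    for line in gaf_lines:
        if line.startswith("!"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) > ASSIGNED_BY and fields[ASSIGNED_BY] == old:
            fields[ASSIGNED_BY] = new
        yield "\t".join(fields) + "\n"


# Made-up example annotation line with 17 tab-separated columns.
example_fields = [
    "UniProtKB", "P12345", "ABC1", "", "GO:0005634", "PMID:1", "IDA", "",
    "C", "", "", "protein", "taxon:9606", "20130415", "GOC", "", "",
]
example = ["!gaf-version: 2.0\n", "\t".join(example_fields) + "\n"]

for converted in replace_source(example):
    print(converted, end="")
```

Restricting the change to one column is safer than a plain text search-and-replace over the whole file, which could also alter "GOC" wherever else it happens to occur.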

Original comment by: ValWood

pombase-admin commented 11 years ago

The export part of this is done. I'm waiting for feedback from Ruth.

Once a session is exported it won't be editable, as that would cause confusion.

I'll do that next.

I think for now we should switch to "GOC"

I need to do that too.

Original comment by: kimrutherford

pombase-admin commented 11 years ago

Exported sessions are no longer editable, but you can view them in read-only/review mode.
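The behaviour agreed in this thread (export only APPROVED sessions, mark them EXPORTED so they are not exported twice, and refuse edits afterwards) can be illustrated with a small sketch. The data model, the `IN_PROGRESS` status and the function names are hypothetical, chosen only for this example; Canto's real implementation is not shown in this thread.

```python
class ReadOnlyError(Exception):
    """Raised when an edit is attempted on an exported session."""


def export_approved(sessions):
    """Mark every APPROVED session as EXPORTED and return the exported keys.

    Because the status changes, calling this a second time returns nothing
    for the same sessions, so they are never exported twice.
    """
    batch = [s for s in sessions if s["status"] == "APPROVED"]
    for session in batch:
        session["status"] = "EXPORTED"
    return [s["key"] for s in batch]


def edit_annotation(session, index, **changes):
    """Apply changes to one annotation unless the session has been exported."""
    if session["status"] == "EXPORTED":
        raise ReadOnlyError(f"session {session['key']} is exported and read-only")
    session["annotations"][index].update(changes)


# Made-up example sessions; "IN_PROGRESS" is a placeholder status name.
sessions = [
    {"key": "s1", "status": "APPROVED", "annotations": [{"term": "GO:0005634"}]},
    {"key": "s2", "status": "IN_PROGRESS", "annotations": [{"term": "GO:0005737"}]},
]

print(export_approved(sessions))  # first call exports the approved session
print(export_approved(sessions))  # second call finds nothing left to export
```

Reading a session needs no guard in this sketch, which matches the read-only/review mode described above: only the editing path checks the status.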

Just to check: currently the "uniprot" tool has the title "Generic UniProt accession GO annotation tool demo". What should it be instead? "GOC annotation tool" or something longer?

Original comment by: kimrutherford

pombase-admin commented 11 years ago

I agree, the name isn't the most elegant. Wouldn't 'GO annotation tool' suffice? Or 'GO curation tool'?

Original comment by: Antonialock

pombase-admin commented 11 years ago

Wouldn't 'GO annotation tool' suffice? Or GO curation tool

Sounds good to me.

I think we didn't go for a title that simple initially as we wanted to avoid making it sound like an official GO tool. I don't know if that actually matters though.

Original comment by: kimrutherford

pombase-admin commented 11 years ago

See also: http://sourceforge.net/p/pombase/curation-tool/522/

Original comment by: kimrutherford

pombase-admin commented 11 years ago

Original comment by: ValWood

pombase-admin commented 11 years ago

Let's close this if all of the export is done, and open a new ticket to deal with the name of the UniProt tool and related issues.

Reopen if anything to do with export is not completed.

Original comment by: ValWood