Closed suzialeksander closed 4 years ago
In paper 302528, I had to manually find and link each occurrence of pif1 in the phrase " pif1-m2".
This issue is pretty major, as marking up a paper can take an hour or so more by having to manually link each occurrence, especially if there are multiple entities not linking. Manually linking up is also very error-prone and a lot of links were missed, leaving a lot of work for the second step (checking the proof). Tagging @nathandunn as it seems to be a coding issue.
Please apply this fix to worm papers as well. For worm however, the linking should only be suppressed if the entity is followed or preceded by a colon.
https://bioentity.link/#/publication/10.1534/genetics.119.302625
Authors have a lot of 'entity;entity' expressions where the genes did not get linked. I started linking them one by one, but there are so many that it is worthwhile to just put in a fix for papers, especially going forward. Thanks.
Words with semicolons should get linked now.
Great, I’ll check it out
--
Karen Yook
Curator / Editor WormBase Caltech / microPublication email: kyook@caltech.edu email: karen@wormbase.org email: karen.yook@micropublication.org skype name: wbkaren tel: +1(415)306-4150
This is still not working and probably adds at least 30-60 minutes to curation time on papers that have a lot of links affected, at 15-20 seconds per link at my fastest working speed. Propose to move to Showstopper level as it's increasingly frustrating to use the linkup tool.
Hi Suzie,
I'm sorry about the linking. You shouldn't have to do the linking manually at all, and if you do, it should be only in rare cases. When you see a pattern in why an entity isn't getting linked, we should address it by adjusting the script; where the cause is author-specific formatting, Sheridan should fix it during the proof stage.
For this paper, can you tell us which entities you had to link? I see things like hyphens, double semicolons and delta symbols. It could be that we need to allow entities to be followed or preceded by these characters for them to be recognized and linked.
Again, sorry about this.
Karen Yook
https://bioentity.link//#/publication/10.1534/genetics.119.302700 I had to manually link
It looks like I need to add the following exceptions:
I am not sure what to do about the prIME1. It can't just look for substrings because there would be too many false links. Should it look for a change in case and try to link or are there set characters that it should match (I see pr and oe)?
I did not want subscripted entities linked in worm papers, however it is easier to remove links than to not have them. So if you can't do species-specific linking, opt for applying the subscript linking to all.
I am assuming that there will only be a set of characters that should match with respect to pr and oe, but let's ask @suzialeksander first.
Note to self: delta is being encoded as Δ
Is it important for the delta to be part of the link?
Hi Suzie,
Nick is adjusting the linking now. Do you want the delta symbols to be included as part of the linked entity?
Karen
The allele here is rlm1, so it doesn't match rlm1Δ. We could add the delta to the allele lexica or create an exception that links a trailing delta when present.
I would set it to just link the trailing delta when present. Worm also has variation suffixes of this type, specifically: gof, lof, gf, lf, dm, sd.
There are probably others, so if possible it would be good if there was a way to be able to modify these modifiers, rather than relying on hard coding these things.
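One way to avoid hard-coding the modifiers is to keep them in a small editable configuration and build the match pattern from it. The following is only a rough sketch of that idea, not the tool's actual code: the JSON layout, the `build_pattern` name, and the assumption that a suffix directly follows the entity are all hypothetical. The suffix values themselves are the ones listed in this thread.

```python
import json
import re

# Hypothetical configuration: suffixes that may trail an entity, per species.
# The values come from this thread; the JSON layout is an assumption.
SUFFIX_CONFIG = json.loads("""
{
  "yeast": ["Δ"],
  "worm": ["gof", "lof", "gf", "lf", "dm", "sd"]
}
""")


def build_pattern(entity, species, config=SUFFIX_CONFIG):
    """Return a regex matching `entity` with an optional configured suffix.

    Group 1 is the entity; group 2 is the suffix (None or empty if absent).
    """
    # Longest suffixes first so a longer suffix is tried before a shorter one.
    suffixes = sorted(config.get(species, []), key=len, reverse=True)
    suffix_group = "|".join(re.escape(s) for s in suffixes)
    tail = f"({suffix_group})?" if suffix_group else "()"
    # The lookarounds stop the match from firing inside a longer word,
    # for example rlm1 inside rlm10.
    return re.compile(
        rf"(?<![A-Za-z0-9])({re.escape(entity)}){tail}(?![A-Za-z0-9])"
    )
```

With this layout, adding a new modifier would mean editing the configuration rather than the matching code.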
I made the necessary fixes to allow those things to be automatically linked now. This includes genes prefixed with pr or oe, subscripts, entities preceded by a hyphen, and a trailing delta symbol.
When I developed the tool to manually add links, it was intended to be used only for lexica that are missing from our database. In theory, everything in the database should be linked automatically. When things aren't linked, it is because the entity appears in a way the tool wasn't designed to handle. Clicking "Link all" will not work in these cases because the linking tool hadn't been updated to recognize these situations. It is important to keep track of all instances where a known entity is not being linked automatically so we can update the tool to handle as many of these cases as possible and save time moving forward.
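For readers following along, the cases described above (a pr or oe prefix, a preceding hyphen, a trailing delta) could be handled by a boundary-aware pattern along these lines. This is an illustrative sketch under stated assumptions, not the code that was deployed: `find_links`, `PREFIXES`, and `TRAILING` are hypothetical names, and subscript markup is not covered.

```python
import re

# Assumed values, taken from the cases reported in this thread.
PREFIXES = ("pr", "oe")   # e.g. prIME1
TRAILING = ("Δ",)         # trailing delta


def find_links(text, entity, include_trailing=True):
    """Return (start, end) spans to link for `entity` in `text`.

    A leading prefix from PREFIXES is tolerated but left outside the span.
    A trailing delta is included in the span only if `include_trailing` is set.
    """
    prefix = "|".join(re.escape(p) for p in PREFIXES)
    trailing = "|".join(re.escape(t) for t in TRAILING)
    pattern = re.compile(
        rf"(?<![A-Za-z0-9])(?:{prefix})?"
        rf"(?P<entity>{re.escape(entity)})"
        rf"(?P<trailing>{trailing})?"
        rf"(?![A-Za-z0-9])"
    )
    spans = []
    for m in pattern.finditer(text):
        if include_trailing and m.group("trailing"):
            end = m.end("trailing")
        else:
            end = m.end("entity")
        spans.append((m.start("entity"), end))
    return spans
```

Requiring a non-alphanumeric character (or a listed prefix) before the entity avoids the false links that plain substring matching would produce, such as IME1 inside a longer unrelated word. The `include_trailing` flag is there only because whether the delta belongs inside the link is still an open question in this thread.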
The pr and oe prefixes are not typical, although this paper clearly uses them a lot, and it's not unheard of for them to be used.
Yeast also has notation in the format of XxxN::XxxN or XxxNdelta::XxxN, etc., so if you could make sure the linking works when an entity is immediately preceded by a colon, that would be great.
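Since worm papers were requested earlier in this thread to suppress linking next to a colon while yeast papers need it, the colon rule probably has to be species-specific. A minimal sketch of such a switch follows; the `LINK_NEXT_TO_COLON` table and `linkable_spans` function are hypothetical, and the example string is illustrative rather than taken from a paper.

```python
import re

# Assumption: a per-species policy for entities that touch a colon.
# Yeast uses XxxN::XxxN notation; worm asked for suppression next to a colon.
LINK_NEXT_TO_COLON = {"yeast": True, "worm": False}


def linkable_spans(text, entity, species):
    """Return spans of `entity` that should be linked for the given species."""
    pattern = re.compile(
        rf"(?<![A-Za-z0-9]){re.escape(entity)}(?![A-Za-z0-9])"
    )
    spans = []
    for m in pattern.finditer(text):
        before = text[m.start() - 1] if m.start() > 0 else ""
        after = text[m.end()] if m.end() < len(text) else ""
        touches_colon = before == ":" or after == ":"
        # Unknown species default to linking, on the basis stated earlier
        # that it is easier to remove links than to add them.
        if touches_colon and not LINK_NEXT_TO_COLON.get(species, True):
            continue
        spans.append(m.span())
    return spans
```

For the illustrative string `rad9::TEL1`, this links TEL1 in a yeast paper and skips it in a worm paper.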
I chatted with @robnash, and SGD has historically not linked the delta after an entity. However, we would much prefer there to be a consensus among the groups doing markup, to produce a more consistent view for the readers. So, if there is anyone doing markup that would argue for the delta to be linked, we would like to hear it. This may involve doing a Quick Fix for now, to get the Link All functioning, and then we could discuss this and other quirks in person at Alliance Face to Face in Dec.
Another case: TEL1/TEL1-hy909
TEL1 should be linked in both occurrences, so a prefix of / should be added too. I think some of the above changes have gone through, because one of the papers I'm currently doing is linking up pretty well. Thanks.
Great to hear. I will see what I can do about the /.
Link all failed again on https://bioentity.link//#/publication/10.1534/genetics.119.302971. Particularly affected were 16 occurrences of "yRAD27", I think this time because the entity was (usually) directly preceded by the letter y, and sometimes followed by a slash. It is not uncommon for us to have a gene/protein prefixed by a y, or even Sc.
yRAD27 appears as: (hFEN1/yRAD27), yRAD27-deficient, yRAD27
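To help spot patterns like this one, a small diagnostic could report every place where a known entity sits inside a larger token, together with the characters around it. This is a hedged sketch for review purposes only: `unlinked_candidates` is a hypothetical helper, and because it uses plain substring search it should not be used to create links directly, given the false-link concern raised earlier in the thread.

```python
def unlinked_candidates(text, lexicon):
    """Report known entities embedded in a larger whitespace-delimited token.

    Returns (entity, before, after) tuples, where `before` and `after` are
    the characters of the token on either side of the first occurrence.
    Tokens that equal the entity exactly are skipped, since whole-token
    matching already handles them.
    """
    report = []
    for token in text.split():
        for entity in lexicon:
            before, hit, after = token.partition(entity)
            if hit and (before or after):
                report.append((entity, before, after))
    return report


if __name__ == "__main__":
    sample = "(hFEN1/yRAD27) yRAD27-deficient yRAD27"
    for row in unlinked_candidates(sample, ["RAD27"]):
        print(row)
```

Tallying the `before` and `after` values across a paper would show at a glance which prefixes (here y) and which neighbouring characters (here / and -) need to be supported.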
Is this the place for bug reports about the tool?
I was working on https://bioentity.link//#/publication/10.1534/genetics.119.302435, trying to link rad9 in the phrase "rad9-deficient" using the Link all feature. Other occurrences of "rad9-deficient" and the related "rad9-deficiency" do not link, and all must be linked individually. I have a screen recording of this behaviour, but it seems GitHub doesn't like movies. I had similar resistance in the old tool when what I wanted to link wasn't the whole "non-whitespace" word, or when the entity wasn't italicized properly; not sure if that matters.