[REVIEW]: Lexicon-Mono-Seq, DOM Text Based Async MSA Viewer

whedon commented 5 years ago

Submitting author: @IbrahimTanyalcin (Ibrahim Tanyalcin) Repository: https://github.com/IbrahimTanyalcin/lexicon-mono-seq Version: v0.19.0 Editor: @csoneson Reviewer: @hrhotz, @jkanche, @imallona Archive: 10.5281/zenodo.3273310

Status

Status badge code:

HTML: <a href="http://joss.theoj.org/papers/9fbf126b501ad921825f755732450278"><img src="http://joss.theoj.org/papers/9fbf126b501ad921825f755732450278/status.svg"></a>
Markdown: [![status](http://joss.theoj.org/papers/9fbf126b501ad921825f755732450278/status.svg)](http://joss.theoj.org/papers/9fbf126b501ad921825f755732450278)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@hrhotz & @jkanche & @imallona, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

Make sure you're logged in to your GitHub account
Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @csoneson know.

✨ Please try and complete your review in the next two weeks ✨

Review checklist for @hrhotz

Conflict of interest

[x] As the reviewer I confirm that I have read the JOSS conflict of interest policy and that there are no conflicts of interest for me to review this work.

Code of Conduct

[x] I confirm that I read and will adhere to the JOSS code of conduct.

General checks

[x] Repository: Is the source code for this software available at the repository url?
[x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
[x] Version: v0.19.0
[x] Authorship: Has the submitting author (@IbrahimTanyalcin) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?

Functionality

[x] Installation: Does installation proceed as outlined in the documentation?
[x] Functionality: Have the functional claims of the software been confirmed?
[x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
[x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
[x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
[x] Automated tests: Are there automated tests or manual steps described so that the function of the software can be verified?
[x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

[x] Authors: Does the paper.md file include a list of authors with their affiliations?
[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] References: Do all archival references that should have a DOI list one (e.g., papers, datasets, software)?

Review checklist for @jkanche

Conflict of interest

[x] As the reviewer I confirm that I have read the JOSS conflict of interest policy and that there are no conflicts of interest for me to review this work.

Code of Conduct

[x] I confirm that I read and will adhere to the JOSS code of conduct.

General checks

[x] Repository: Is the source code for this software available at the repository url?
[x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
[x] Version: v0.19.0
[x] Authorship: Has the submitting author (@IbrahimTanyalcin) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?

Functionality

[x] Installation: Does installation proceed as outlined in the documentation?
[x] Functionality: Have the functional claims of the software been confirmed?
[x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
[x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
[x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
[x] Automated tests: Are there automated tests or manual steps described so that the function of the software can be verified?
[x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

[x] Authors: Does the paper.md file include a list of authors with their affiliations?
[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] References: Do all archival references that should have a DOI list one (e.g., papers, datasets, software)?

Review checklist for @imallona

Conflict of interest

[x] As the reviewer I confirm that I have read the JOSS conflict of interest policy and that there are no conflicts of interest for me to review this work.

Code of Conduct

[x] I confirm that I read and will adhere to the JOSS code of conduct.

General checks

[x] Repository: Is the source code for this software available at the repository url?
[x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
[x] Version: v0.19.0
[x] Authorship: Has the submitting author (@IbrahimTanyalcin) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?

Functionality

[x] Installation: Does installation proceed as outlined in the documentation?
[x] Functionality: Have the functional claims of the software been confirmed?
[x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
[x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
[x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
[x] Automated tests: Are there automated tests or manual steps described so that the function of the software can be verified?
[x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

[x] Authors: Does the paper.md file include a list of authors with their affiliations?
[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] References: Do all archival references that should have a DOI list one (e.g., papers, datasets, software)?

whedon commented 5 years ago

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @hrhotz, it looks like you're currently assigned as the reviewer for this paper :tada:.

:star: Important :star:

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

Set yourself as 'Not watching' https://github.com/openjournals/joss-reviews:

watching

You may also like to change your default settings for this watching repositories in your GitHub profile here: https://github.com/settings/notifications

notifications

For a list of things I can do to help you, just type:

@whedon commands

whedon commented 5 years ago

Attempting PDF compilation. Reticulating splines etc...

whedon commented 5 years ago

PDF failed to compile for issue #1407 with the following error:

/app/vendor/ruby-2.4.4/lib/ruby/2.4.0/find.rb:43:in block in find': No such file or directory - tmp/1407 (Errno::ENOENT) from /app/vendor/ruby-2.4.4/lib/ruby/2.4.0/find.rb:43:incollect!' from /app/vendor/ruby-2.4.4/lib/ruby/2.4.0/find.rb:43:in find' from /app/vendor/bundle/ruby/2.4.0/bundler/gems/whedon-a1723d160bb6/lib/whedon/processor.rb:57:infind_paper_paths' from /app/vendor/bundle/ruby/2.4.0/bundler/gems/whedon-a1723d160bb6/bin/whedon:50:in prepare' from /app/vendor/bundle/ruby/2.4.0/gems/thor-0.20.3/lib/thor/command.rb:27:inrun' from /app/vendor/bundle/ruby/2.4.0/gems/thor-0.20.3/lib/thor/invocation.rb:126:in invoke_command' from /app/vendor/bundle/ruby/2.4.0/gems/thor-0.20.3/lib/thor.rb:387:indispatch' from /app/vendor/bundle/ruby/2.4.0/gems/thor-0.20.3/lib/thor/base.rb:466:in start' from /app/vendor/bundle/ruby/2.4.0/bundler/gems/whedon-a1723d160bb6/bin/whedon:116:in<top (required)>' from /app/vendor/bundle/ruby/2.4.0/bin/whedon:23:in load' from /app/vendor/bundle/ruby/2.4.0/bin/whedon:23:in

'

csoneson commented 5 years ago

@whedon generate pdf

whedon commented 5 years ago

Attempting PDF compilation. Reticulating splines etc...

whedon commented 5 years ago

:point_right: Check article proof :page_facing_up: :point_left:

csoneson commented 5 years ago

@labarba (or @openjournals/joss-eics) - just to say that the manual change of the path to the submitted repository in the PRE-REVIEW issue (#1379) did not seem to carry over here. I manually changed it again above, which I think should be fine for the review, but I'm not sure whether anything else needs to be changed somewhere for a more permanent effect.

csoneson commented 5 years ago

@hrhotz, @jkanche, @imallona - please perform your review in this issue, by updating your checklist above and posting comments. If you have any questions, don't hesitate to ping me.

hrhotz commented 5 years ago

@IbrahimTanyalcin - I have made a pull request with some minor suggestions for the paper text. Also, please consider adding the names of a few examples for browsers using DOM Text + SVG. Same for MSA viewers relying on the canvas element. This might help putting the text in a wider context.

IbrahimTanyalcin commented 5 years ago

@hrhotz, ty for the suggestions:

merged your pull request
added names to the examples
added jsdelivr/npm link for downloading the script
added a minified version from jsdelivr (~40kB)
added 4 new citations, 2 tools that are similar (DOM text) to lexicon-mono-seq, and 2 that are canvas based (IGV and MSA Viewer)

jkanche commented 5 years ago

@IbrahimTanyalcin - I created a pull request with a few minor edits and fixes to the paper. Couple of things

version number of the release does not match the one in the paper
I'd consider adding this to examples
- https://github.com/dariober/ASCIIGenome
rename files in examples with specific names so that its easier to browse

IbrahimTanyalcin commented 5 years ago

@jkanche, ty for the suggestions

merged your PR
add a new reference for ascii genome
changed the example names in the repo

hrhotz commented 5 years ago

@IbrahimTanyalcin - I have been looking at the "clustalW.html" example and I have a few questions/comments:

- what is the "aa" color scheme based on?
- would it make sense to offer more than just one color set for amino acids?
- how do you deal with sequences containing Selenocysteine? I have modified your example by replacing an "A" with and "U" and the complete sequence is no longer displayed,
- why is the line describing the identity/similarity (the line containing "*",":", etc) not displayed
- I don't understand your comment: "You can also read clustal files if you provide them to LexiconMonoSeq as a string". You already provide the data as a string in the example, don't-you?
- displaying an alignment where you provide a file path would be useful.

imallona commented 5 years ago

Hi @IbrahimTanyalcin,

Thanks for the resource. I have some comments/questions regarding the rendered PDF version of the paper, and the resource and its documentation.

I noticed the paper update has fixed some typos already (dna -> DNA), but I still find some in citations. Namely,

ES6 citation is currently is listed as (International, 2018) instead of (Ecma International, 2018).
The reference for lexicon library by Tanyalcin et al. does not indicate the journal name.
Also, this paper footpage Tanyalcin, (2019). doesn't list the author's full name, but only the last name.

Regarding the tool usability, it is stated that (...) the amount of sequences is limited to your browsers capacity, recommended limit is 8000 sequences with 100000 characters long each (...). Is it possible to tabulate/stratify such recommendation, i.e. providing a limit for commodity laptops as well as to recent/powerful machines/workstations?

Regarding the tool usability and/or documentation, I acknowledge the high flexibility given to the durations of animation effects, which (if I'm not wrong) can be quite short, i.e. 17ms. As intermittent visual stimuli are known to elicit epileptic seizures to some, I wonder which are your thoughts on including a warning at the resource documentation suggesting good practices. The W3C indicates that, in order to avoid triggering seizures to photosensitive epilepsy users, blinks should be slower than three a second https://www.w3.org/TR/2008/REC-WCAG20-20081211/#seizure . Not sure of the examples' animation speeds (the ones rendered at readme.md), but if they were faster than that I would suggest slowing them down to the W3C recommendations.

IbrahimTanyalcin commented 5 years ago

@hrhotz Thank you for the well raised comments, let me go over some of these:

the "aa" (amino acid) color scheme is from a thread I found here: http://ugene.net/forum/YaBB.pl?num=1337064665 not perfect but made some sense at the time.
You can add any custom color scheme, using the instance.registerType method. It looks like this:

instance.registerType("myType",{"A":"#ffbbaa","B":"00aacc"...})

You can also specify a text color and opacity which are optional, default text color is rgb(0,0,0,0.9) and default opacity for any character is 0.8

Normally you do not have to create a new object from scratch to pass to instance.registerType as an argument, a copy function exists instance.caveManCopy on the prototype chain. Example-7(chrX10Mb.html in the repo examples folder) makes use of this. It registers a new type with new characters.

The error you encountered (things not rendering) is possibly because the new char you specified was not registered in the type and it threw Uncaught Reference Error, I am not using try/catch blocks there because I presume it would slow down execution.

For the similarity line with "*" character for Clustal.wl files, I initially thought the coloring would be sufficient to draw human perception to similar regions, but now after your comment I realize it might not be adequate. The static function that reads the clustal.wl is here and specifically this is the line that "discards" the lines with "*". I propose that we pass an optional argument to not filter out those lines. What do you think.
That is correct, I will revise that sentence, "also" is unnecessary. Thank you for pointing out.
Indeed, currently in the examples I use XMLHttp (not fetch for older browsers). I propose I add a static "fetch" function to read out clustal.wl files. An options arguments in the form of an object could be passed where keys:
- retry : specify retry count upon failure
- medium: either "local" (for file API) or "remote" (default, uses XMLHttp) The above would return a thenable object which you can chain later on. Is that ok?

IbrahimTanyalcin commented 5 years ago

@imallona Thank you for pointing out these blind spots. I will go over the citations, did not realize that they were incomplete.

For the performance, I just tested the first example and the letters of PI on an old iphone 5S to see if it did render. The animations were not butter smooth, however it handled without problems (iphone 8 and X rendered without problems and with 60fps). Since we are working with DOM, we are mainly using CPU (apart from transform css style, which creates a layer for GPU) and this could be quite variable. That being said, since sequence length is not the bottle neck and to be on the safe side, we could say for most commodity laptops 1000 sequences with 100000 letters each could run reasonably smooth (since they run on iphone 5S). I have access to variety of different devices including android, iOS and Mac. If you have specific hardware let me know so I can test it on them.

About the epileptic seizures, thank you for raising this point! I must say I am completely ignorant in this field and I think it is a good idea to at least include some sort of disclaimer. For the duration, you are correct that the lowest you can go is 17ms (this is hardcoded in window.requestAnimationFrame API). The default duration is 1500milliseconds, this controls the sliding of the text. The colored boxes are filled with a default duration of 150ms with a limit of 80 characters per line. Another parameter is also the painter count (these are parallel async renderers), by default there is only 1 painter, an example where you can spawn more painters is here.

What above means is that, by default if a display has 15 sequences with a screen with of 120 characters, it would take Math.ceil(120/80) 15 150 + (15-1) * 17 ~= 4750ms to fill the screen and the screen would briefly "update" 30 times within this period, that is roughly 6 "updates" per second which is twice the recommended range in the link you kindly provided.

I suggest we add a disclaimer, and for some examples like this, we could increase the duration to 350ms. What do you think?

hrhotz commented 5 years ago

@IbrahimTanyalcin - Thanks for your feedback, please allow me to go a little bit deeper into some of my comments (which I now have numbered) wrt to the "clustalW.html" example:

- your chosen "aa" color scheme does make sense. Though, you should probably acknowledge the person who set it up originally.
- the option to add a custom color scheme is great. It might help if you document this (with a code example) in addition to the code example shown for the "asciiArt" and/or add the text you have written for "Example-7(chrX10Mb.html)" in this thread to the documentation. However, my actual comment was: whether it would make sense to have several pre-defined color schemes which you can chose from. As an example have a look at Jalview (http://www.jalview.org/)
- having a "U" in the sequence should not threw an 'Uncaught Reference Error'. Nowadays, "U" (aka Selenocysteine) is very common in amino acid sequences (e.g.: https://www.uniprot.org/uniprot/P07203#sequences ).
- an optional argument sound very good
- sure, whatever is the easiest way to implement it, is fine

imallona commented 5 years ago

Hi @IbrahimTanyalcin,

Thank you for your feedback. I see, so setting thresholds for performance is a difficult task due to the huge variety of hardware available. My comment arose from the paper's sentence The amount of sequences is limited to your browsers capacity, recommended limit is 8000 sequences with 100000 characters long each, which I found not very useful as it does not specify what hardware that recommendation applies to (as you say, CPU, operating system/browser). I think that, if updated to add the recommendations for commodity laptops/phones you kindly posted here (1000 sequences with 100000 letters each), that sentence would be more informative, what do you think?

As for the update rate I appreciate your reflection and agree with your suggestions of adding a disclaimer and slowing down some examples.

Aside, as a minor comment unrelated to the former, I just noticed the paper.md refers to version 0.17.3 (...)the library is about 42kB (as of version 0.17.3)(...); you might want to update both library version and its size. Also noticed that dna and ascii should be capitalized.

csoneson commented 5 years ago

👋 @hrhotz, @imallona, @jkanche, @IbrahimTanyalcin - just wanted to check in to see how things are going here! I can see that there are some unticked boxes in all the reviews, as well as some comments for @IbrahimTanyalcin. It would be great to have a quick status update - and please don't hesitate to let me know if there is anything I can do to facilitate.

IbrahimTanyalcin commented 5 years ago

@csoneson @hrhotz @imallona , The reviews were great! They just covered a lot of blind spots that I never thought of. I had to pause for a week, because I am actually integrating this library into mutaframe.com:

example

Not in production yet, but almost there. I'm getting back to the comments at the end of this week. Hope that is ok.

csoneson commented 5 years ago

👋 @jkanche - just checking in on your review status. There are several unticked boxes in your checklist - could you provide some comments for @IbrahimTanyalcin explaining your current reservations in some more detail?

jkanche commented 5 years ago

@csoneson, I did send a pull request with suggestions and updates to the paper, and I was waiting to hear back from @IbrahimTanyalcin until he's ready for the review again since he said he's working on new updates and integration in his last comment

csoneson commented 5 years ago

Thanks for the update @jkanche. @IbrahimTanyalcin - do you have an estimate of when the reviewers could have an updated version to look at (you mentioned above that you would get back to the comments at the end of last week)?

IbrahimTanyalcin commented 5 years ago

@csoneson, getting back to it this weekend :) I merged that pull request I think.

csoneson commented 5 years ago

👋 @IbrahimTanyalcin - any updates on the revisions?

IbrahimTanyalcin commented 5 years ago

:wave: @csoneson - I finished almost adding it to MutaFrame. I was thinking of adding Web Component based constructor to it, however it might be far fetched. I'll only push what has been requested by tomorrow:

Selenocysteine
Support for clustal.wl
Doc updates I'll send a notification when I'm finished

IbrahimTanyalcin commented 5 years ago

:wave: @csoneson @hrhotz @jkanche @imallona

I hope you all are doing great, I finally could bring some updates:

Added disclaimer for seizures https://github.com/IbrahimTanyalcin/lexicon-mono-seq#disclaimer
Updated paper.md, I might have missed a couple of points
Added Async operations, these are very similar to ES6 Promises, but instead they are written in ES5 and do not require polyfills https://github.com/IbrahimTanyalcin/lexicon-mono-seq#async-operations
Added tutorial for registering new color schemes/types https://github.com/IbrahimTanyalcin/lexicon-mono-seq#registering-new-colors
Added ~10 color schemes from Jalview https://github.com/IbrahimTanyalcin/lexicon-mono-seq#10-choosing-color-schemes
Most color schemes now include the letter "U" https://github.com/IbrahimTanyalcin/lexicon-mono-seq/blob/master/lexiconMonoSeq.v0.18.0.js#L169
Added 3 new examples:
A new static method parseURL added, internally used by fetch https://github.com/IbrahimTanyalcin/lexicon-mono-seq#static-methods
A new async operator ìsFontLoaded is added, checks for the users whether the font is loaded, then returns a thenable https://github.com/IbrahimTanyalcin/lexicon-mono-seq#a-word-about-the-fonts

csoneson commented 5 years ago

Thanks @IbrahimTanyalcin. @jkanche, @hrhotz, @imallona - could you please have a look at the updated submission when you have a chance?

hrhotz commented 5 years ago

@IbrahimTanyalcin - I am still looking at the "clustalW.html" example referring to the same points as in my comments from April 29/30:

solved
it is great, you have added the other color schemes. But it is not clear to me, how I change from "aa" to "jalviewClustal " in this example
solved
after adding ",{conservation:true}" it displays the conservation. However, in the example, there are 'tab' instead of blanc spaces, and therefore the line is not displayed correctly.
solved
excuse my ignorance, but it is not clear to me how I can replace the hardcoded clustalW alignment with a link to an alignment file.

hrhotz commented 5 years ago

@csoneson WRT "General checks": I have added the "OK" for 'Repository' and 'Version'...though, I am still confused about the link for "repository url" and I am not sure about the version: is there any update mechanism in order to change it to 18.0

IbrahimTanyalcin commented 5 years ago

@hrhotz No worries, I will write a small section to explain how links can be used. The ClustalW normally specifies a single whitespace, the second example is more recent (https://github.com/IbrahimTanyalcin/lexicon-mono-seq/blob/master/examples/asyncFetch.html), I'll try if I can update the converter to except tabs. I'll also add a section of how you can change it from "aa" to other color scheme from a String input. @hrhotz, many thanks for pointing these blind spots!

hrhotz commented 5 years ago

@IbrahimTanyalcin WRT tabs/single space: no need to write a converter to except tabs...it could cause other issues. It is better to fix the example (If time permits, I will make a pull request later).

imallona commented 5 years ago

Hi @IbrahimTanyalcin, thanks for adding the disclaimer; let me check the compiled version of the updated paper.md. Minor comment on versioning: noticed the paper refers to v0.18 (which is the latest tag) but this issue refers to an old v0.15.12, which is not the latest release.

@whedon generate pdf

csoneson commented 5 years ago

@whedon generate pdf

whedon commented 5 years ago

Attempting PDF compilation. Reticulating splines etc...

whedon commented 5 years ago

:point_right: Check article proof :page_facing_up: :point_left:

csoneson commented 5 years ago

@hrhotz, @imallona - the version in this issue is set at submission time. When a submission is eventually accepted, the version at that time is archived and a DOI is assigned. So at this point, it is not a problem that the version given here differs from the one in the paper. @hrhotz - for the repository link, this is ok too. The original link was incorrect at submission; this was later fixed, but the link in the checklist is still the submitted one.

jkanche commented 5 years ago

Hello @csoneson, I'm currently on travel with limited internet access, I will get to this by the end of this week. Sorry for the delay!

imallona commented 5 years ago

Thanks @csoneson for the explanation on versioning, and for compiling the proof.

@IbrahimTanyalcin could you please fix the International, E citation to be rendered as Ecma International? I also noticed that the Github, 2018 reference supporting (...) utility functions that can be hooked to custom buttons as desired points to github.com instead of linking a repository, is this correct?

jkanche commented 5 years ago

I have a few comments on the paper.

Several MSA viewers rely on rendering characters as an image using the canvas element. Although reliable and convenient, this can result in 2 problems: - Higher memory usage, - Blurred text (increasing resolution increases memory usage)

The blurred text is probably because of a HiDPI display. you probably have to calculate the pixel density and then create a canvas with larger resolution, but yes, higher resolution uses higher memory.

The amount of sequences is limited to your browsers capacity, recommended limit is 1000 sequences with 100000 letters each for commodity laptops/mobile devices.

Browsers can show any amount of text. I think the limitation here is with the number of SVG elements you can add to the DOM. This is why canvas is preferred over d3 for large data sets (a single canvas element vs thousands of SVG nodes). One advantage you have here is there's no interactivity with your SVG elements. Also was this number (1000 sequences, each 100000 bp) based on some benchmark testing you did with the library ?

I don't understand the GitHub and NPM citations at the end of the article.

I was thinking, moving the clustal examples to the top might be better.

IbrahimTanyalcin commented 5 years ago

:wave: @csoneson @hrhotz @imallona @jkanche

I think I'll be able to do corrections and the section I promised to @hrhotz over the weekend. Just to quickly answer @jkanche,

Browsers can show any amount text

That I guess for certain limits, that seems correct. However let's put that to test:

https://stuff.mit.edu/afs/sipb/contrib/pi/pi-billion.txt

This loads about a billion letters of pi, as the page grows you'll notice scroll performance will deteriorate. This means that even a single DOM phasing element with a single textNode has a cost. For that reason I have taken this approach:

Whatever not visible on the current dimensions of the viewport don't render,
If previous textNodes exist, update them

I think the limitation here is with the number of SVG elements you can add to the DOM

This is also not completely correct, for instance here:

https://distreau.com/lexicon-mono-seq/examples/PI10Million.html

We never have 10 million svg rects here. They are only added as much as we need.

Moreover if they are removed, they are NOT garbage collected, they are saved here and are ultimately transferred in bulk with document fragments

On top of that the rendering process is completely async, and it is rendered in pieces(here and here). If scrolling occurs before the scene is done, the elements are recycled and repositioned.

The animations are not calculated dynamically, they are cached, and you can pretty much define any sort of function using 2 control points of cubic bezier

So when above combined, if traditional canvas without GPU is used, this approach (DOMText + svg) will be mostly faster and have lighter memory footprint. I did several tests my self, you can also check the performance from chrome devtools.

Although current approach has to use CPU mainly, in real life comparison with state of the art MSA tools that do use GPU, I have not seen performance differences.

I did the 1000x10000 benchmark on iphone5, it was good enough so I said let's keep it there. For desktop it can run up to 8000, gets slower afterwards because I have animations.

One advantage you have here is there's no interactivity with your SVG elements

You can interact with them, there are functions to get their relative positions here, these functions return the entire sequence ledger with DOM nodes, so you can pretty much do anything with them without impacting performance

jkanche commented 5 years ago

That I guess for certain limits, that seems correct. However let's put that to test:

https://stuff.mit.edu/afs/sipb/contrib/pi/pi-billion.txt

The issue here is not with the text, its with the transfer time, since the browser is rendering the text and streaming chunks of the text file at the same time. This causes the lag, but once everything is in memory, the page loads the entire text pretty quickly. for me, It took 5.6 minutes (from devtools) to transfer the entire file from the server to the client.

The amount of sequences is limited to your browsers capacity, recommended limit is 1000 sequences with 100000 letters each for commodity laptops/mobile devices.

I think the limitation here is with the number of SVG elements you can add to the DOM

My comment was based on the statement in the paper. So if the page never renders the entire sequence(s), I think the paper doesn't have to mention this as a limitation ?

if traditional canvas without GPU is used, this approach (DOMText + svg) will be mostly faster and have lighter memory footprint. I did several tests my self, you can also check the performance from chrome devtools.

were these tests documented anywhere in the GitHub repository ?

I didn't know that you can add interactivity to the elements, thanks for clarifying that up.

IbrahimTanyalcin commented 5 years ago

@jkanche,

The limitation 1000x10000 is a good idea due the ledgers (these are js objects) that keep track of each sequence row. The string length per row is not that important, you can have 1x10^6 letters or so (if I remember correctly string operations under the hood was reliant on C++ on chrome, so they are fast.) But the amount of track count starts to tire out the CPU.
no I have not added the profile files from chrome. Jsdelivr has 50mb package limit. However I can upload them on one of my domains so people can pull out if they need to and give then links on the repo. Do you have any preference for software to compare with? I'll do them tomorrow

IbrahimTanyalcin commented 5 years ago

:wave: @csoneson @hrhotz @imallona @jkanche

Here are some updates:

The npm and github references are intentional yes, I thought I'd give credit to them for all the hosting etc.
The Ecma International should be in full now.
A new version 0.19.0 is released, the async operators are almost identical to how Promises work.
Exercise order changed with suggestions of @jkanche:
- https://github.com/IbrahimTanyalcin/lexicon-mono-seq#examples
Async operator section is updated, and how both observer or promise pattern can be used is explained:
- https://github.com/IbrahimTanyalcin/lexicon-mono-seq#async-operations
With suggestions of @hrhotz, reading Clustal and JSON sections are added:
- https://github.com/IbrahimTanyalcin/lexicon-mono-seq#reading-remote-clustal-files
- https://github.com/IbrahimTanyalcin/lexicon-mono-seq#reading-json-files
I have forgotten to add posOffset:
- https://github.com/IbrahimTanyalcin/lexicon-mono-seq#updating-data
With suggestions of @jkanche, a performance section has been added, seems like lexicon is a bit faster with better load times, however I tried to emphasize that this is not about performance, it is about providing a more familiar DOM text based alternative. The profiles are downloadable in an indexed location listed at below link:
- https://github.com/IbrahimTanyalcin/lexicon-mono-seq#performance
If you want to see the tests live, I have 2 fiddles, one for lexicon-mono-seq and one for the other, you need to wait a bit for the MSAViewer:
- lexicon-mono-seq: https://jsfiddle.net/ibowankenobi/apjcxq1w/
- MSAViewer: https://jsfiddle.net/ibowankenobi/s97g4vxa/
A remedies for bugs and os specific glitches has been added:
- https://github.com/IbrahimTanyalcin/lexicon-mono-seq#remedies

jkanche commented 5 years ago

@whedon generate pdf

whedon commented 5 years ago

Attempting PDF compilation. Reticulating splines etc...

whedon commented 5 years ago

:point_right: Check article proof :page_facing_up: :point_left:

csoneson commented 5 years ago

@IbrahimTanyalcin Looking at the remaining points in the reviewers' checklists above, could you add a couple of statements to the README with guidelines for people who may want to contribute/seek help, and pointers to how to best test the functionality of the software (including how to verify that the output is correct)? There is one reference (Robinson et al) where the https://doi.org/ part of the DOI should be removed. @hrhotz, @imallona, @jkanche - please comment if you have additional comments, or let me know (and fill the checklist) if you are happy with the current state. Thanks!

imallona commented 5 years ago

Hi @csoneson and @IbrahimTanyalcin, I'm happy with the current version, thanks.

IbrahimTanyalcin commented 5 years ago

:wave: @csoneson

added contributing section to readme
added eslint and js-beautify templates if needed for PRs
added a doi field to Robinson et. al.

openjournals / joss-reviews