theosanderson / taxonium

A tool for exploring very large trees in the browser
http://taxonium.org
GNU General Public License v3.0
98 stars · 16 forks

goes over 100% #90

Closed · maximilianh closed this issue 2 years ago

maximilianh commented 2 years ago

When loading the public SARS-CoV-2 tree, the progress indicator goes to 122% or so.

niemasd commented 2 years ago

I just saw this as well:

[screenshot: progress indicator above 100%]
maximilianh commented 2 years ago

I just ran into this today while showing taxonium to the BV-BRC group that is behind VIPR and a few other viral genomics tools.


niemasd commented 2 years ago

I looked through the source code, and I imagine this part is where the bug might be:

https://github.com/theosanderson/taxonium/blob/279162e7596c1f49acbf5a14c837c55865ae40e0/src/Taxonium.jsx#L131

            let percentCompleted = Math.floor(
              1 * (progressEvent.loaded / 50000000) * 100
            );

Should the denominator be hardcoded to 50000000, or should it be computed based on what is being downloaded? The following might help:

https://stackoverflow.com/a/63067578
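For reference, a minimal sketch of what such a fix could look like (this is not the Taxonium implementation; `percentComplete` and the fallback value are made up for illustration): prefer the real total when the browser can report one from Content-Length, fall back to the hardcoded estimate otherwise, and clamp so the bar can never pass 100%.

```javascript
// Hypothetical helper (not from the Taxonium source): prefer the real
// total when available, fall back to a fixed estimate otherwise, and
// clamp the result so it never exceeds 100.
function percentComplete(loaded, total, fallbackTotal = 50000000) {
  const denominator = total && total > 0 ? total : fallbackTotal;
  return Math.min(100, Math.floor((loaded / denominator) * 100));
}

// With a known total the value is exact; with the fallback it is only
// approximate, but it can no longer overshoot:
console.log(percentComplete(25000000, 100000000)); // 25
console.log(percentComplete(61000000, 0)); // 100 (clamped)
```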

theosanderson commented 2 years ago

Hi folks,

Sorry for ignoring this issue for so long. The reason this is hardcoded is that when files are served as bare .pb files under gzip compression (i.e. content-encoding: gzip), it is not possible, I think, to know their size in advance. (Admittedly they are no longer served that way on the main site, since the uncompressed file is currently too big for the repo, but it is preferable for performance.) The number still gives some indication that something is happening. This is not a priority for me right now, as I am working on bigger changes that would obviate it, but thanks for flagging, and I'll leave the issue open.

maximilianh commented 2 years ago

As a user, you wait for the bar to reach 100%; when it suddenly goes past that, you have no idea what's going on and assume it's a bug.

Technically this is a minor bug, but it is a very visible problem for new users.

Like Microsoft Windows, I'd rather overestimate the total than underestimate it. It's not a problem if the bar stops at 70%, but it is a problem for the human user if the bar stops at 130%, since the user is waiting for it to reach 100%.

Also, you can get the file size from the HTTP header, as Niema pointed out, and since the compression factor won't vary much, you can estimate the total uncompressed size quite well from it. But that's a little more work.
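That estimate could be sketched like this (the ratio below is a hypothetical placeholder, not a measured value, and `estimateUncompressedBytes` is not part of Taxonium):

```javascript
// Hypothetical sketch of the estimate described above: the server's
// Content-Length header gives the compressed size, and multiplying by a
// typical gzip ratio for this kind of data approximates the uncompressed
// total. The ratio is a placeholder, not a measured value.
const ASSUMED_GZIP_RATIO = 3.5;

function estimateUncompressedBytes(compressedLength, ratio = ASSUMED_GZIP_RATIO) {
  return Math.round(compressedLength * ratio);
}

// e.g. a 10 MB compressed download would be estimated at 35 MB uncompressed:
console.log(estimateUncompressedBytes(10000000)); // 35000000
```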


theosanderson commented 2 years ago

Bumped this value up. Watch for a much more satisfactory solution soon.

niemasd commented 2 years ago

@theosanderson

it is not possible I think to know their size in advance

Hmm, I'm not sure about that. I believe this information is in the gzipped file's footer:

"an 8-byte footer, containing a CRC-32 checksum and the length of the original uncompressed data, modulo 2^32" (https://en.wikipedia.org/wiki/Gzip#File_format)

But yeah, it might not be worth investigating just to fix the numerical percentage. I would recommend at least removing the number and replacing it with some other indicator of progress (e.g. dots).
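For completeness, the footer field quoted above can be decoded in a few lines. A server that supports Range requests could in principle serve just the last four bytes (e.g. `Range: bytes=-4`); the sketch below shows only the decoding step, and `parseGzipIsize` is a hypothetical name, not Taxonium code.

```javascript
// The last 4 bytes of a gzip stream (the ISIZE field) hold the length of
// the original uncompressed data modulo 2^32, stored little-endian.
function parseGzipIsize(lastFourBytes) {
  const view = new DataView(
    lastFourBytes.buffer,
    lastFourBytes.byteOffset,
    4
  );
  return view.getUint32(0, true); // true = little-endian
}

// 123456 is 0x0001E240, so its little-endian encoding is 40 e2 01 00:
console.log(parseGzipIsize(new Uint8Array([0x40, 0xe2, 0x01, 0x00]))); // 123456
```

Note that ISIZE is only exact for files under 4 GiB, since the stored length is taken modulo 2^32.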