earlyprint / earlyprint.github.io

Homepage for the EarlyPrint Project: Curating and Exploring Early Printed English
https://earlyprint.org/

performance warning? #25

Open martinmueller39 opened 4 years ago

martinmueller39 commented 4 years ago

When you open a large file in oXygen, it gives you a warning that performance may slow down. oXygen can also disable certain functions for very large files, but that's irrelevant to our purpose.

Would it be helpful to add warnings for large files? There are not quite 1,000 texts longer than 250,000 words (Hobbes' Leviathan). 300,000 words add up to about 20MB when the file is annotated. Performance with files of that size is perfectly adequate, if not lightning fast.

There are ~ 300 files with 500,000 or more words. I leafed through pages in Buchanan's Scottish History (just above 500,000) and performance wasn't noticeably slower. It took a couple of seconds to "slide" across 100 pages.

Performance is noticeably slower for the ~ 100 texts that are as long as or longer than Gerard's Herbal (900,000 words). Page turning becomes a matter of five or more seconds.

I don't think it would be very difficult to flag the very long texts and display that flag as part of the summary information that the reader first sees. Could (or should) we retrieve the wordcount figure from the xenoData and display it in the Browse window?

Apart from the problem of large files, there are the vagaries of the Internet Archive's IIIF server. So we may want to have some general warning that we have no control over the retrieval time of any image.

I don't know whether or how fast performance degrades with concurrent users.
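To make the proposal concrete, here is a minimal sketch of a threshold-based flag. The two cutoffs come from the figures quoted above (roughly 500,000 and 900,000 words); the function name, the constant names, and the wording of the notes are hypothetical and are not taken from the EarlyPrint code base. How the word count is read from the xenoData is left out, since that depends on the file layout.

```python
# Cutoffs taken from the word counts quoted in this thread.
# Names and note wording are hypothetical, for illustration only.
LONG_WORDS = 500_000   # about Buchanan's Scottish History: no noticeable slowdown
SLOW_WORDS = 900_000   # about Gerard's Herbal: page turns take five or more seconds


def performance_note(word_count: int) -> str:
    """Return a reader-facing note for a text of this length, or '' if none is needed."""
    if word_count >= SLOW_WORDS:
        return "very long text: page turns may take several seconds"
    if word_count >= LONG_WORDS:
        return "long text"
    return ""
```

A fixed cutoff like this is easy to explain to readers, but the right values would need to be checked against actual timings on the server.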

craigberry commented 4 years ago

The word and page counts are already displayed under the More button. The text filter permits selecting by a range of page counts and/or a range of word counts, and the sort option allows you to sort by page count or word count. We should probably document somewhere that longer texts are slower, but I'm not sure where or exactly what to say.

Craig A. Berry

"... getting out of a sonnet is much more difficult than getting in." Brad Leithauser

martinmueller39 commented 4 years ago

So they are. Silly me. The simplest and most effective thing would be to flag the 100 longest texts and put something like “very long text” right at the top, where it is visible before you click the More button. As for digital combos, the More section might have a boilerplate sentence such as “the images for this text come from an IIIF server at the Internet Archive, and the retrieval time is a function of traffic”.
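A rough sketch of that idea, assuming the word counts are available as a simple mapping from text id to count. The function names, the mapping, and the label format are all hypothetical; only the "100 longest" rule and the two pieces of wording come from the comment above.

```python
import heapq

# Boilerplate sentence proposed above for texts whose images come from the Internet Archive.
IIIF_NOTE = (
    "The images for this text come from an IIIF server at the Internet Archive, "
    "and the retrieval time is a function of traffic."
)


def flag_longest(word_counts: dict[str, int], n: int = 100) -> set[str]:
    """Return the ids of the n texts with the highest word counts."""
    # Iterating a dict yields its keys; key=word_counts.get ranks them by count.
    return set(heapq.nlargest(n, word_counts, key=word_counts.get))


def summary_line(title: str, is_flagged: bool) -> str:
    """Append the 'very long text' label to a title when the text is flagged."""
    return f"{title} [very long text]" if is_flagged else title
```

Unlike a fixed word-count cutoff, flagging the top 100 keeps the number of flagged texts constant as the corpus grows, at the cost of a threshold that drifts over time.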
