trufanov-nok / minidjvu-mod

A multipage DjVu encoder. This is a fork of minidjvu with full-scale optimization of shared dictionaries (djbz) and a few tricks (multi-threading etc.) to compensate for the resulting performance drop.
GNU General Public License v3.0

Can't get minidjvu_mod to work #2

Closed maple7-7-7 closed 3 years ago

maple7-7-7 commented 4 years ago

Hi again, Alexander,

I love the possibility of a reduced DjVu file size using your new fork.

But I keep having problems installing.

I tried to install on two computers and am getting more error messages.

I tried your corrections and still no luck.

I love experimenting with DjVu, including annotations, but I have little Linux programming experience.

I have a number of pbm to DjVu projects that I would love to test for you and me, in terms of changes in file size using the fork for a given set of DjVu parameters.

It would be revealing, for instance, to compare minidjvu with your minidjvu fork and with any2djvu and DjVu Solo 3.1, for a set number of dictionary pages. The program DjVuToy is also useful for comparing possibilities.

I have several self-made DjVu documents derived from pbms and bmps, some hundreds of pages long. It would be interesting to use your new program with these.

Is there a way for you to create a version of the minidjvu fork that will install as easily as minidjvu itself? Then maybe I can give you lots of resulting data regarding the efficacy of your new program at reducing DjVu file size.

I have tinkered a lot with DjVu, including with annotations, but I am no programmer.

I could, if necessary, remove minidjvu and then test the fork after a new fork install.

Thanks again for your great work and for your interesting discoveries about encoding DjVu.

Stephen Jones

trufanov-nok commented 3 years ago

Hi Stephen! I'm very sorry, but I didn't get any notifications about your issues for some reason... Perhaps I need to add myself to watchers... that's strange.

I think I need to add some installation notes to the project. I would also like to create a .deb package for it and upload it to my PPA on Launchpad so that Linux users can get it and its updates more easily. I think it will take a day or two... Which Linux systems and versions are you using?

Currently there are only Windows binaries that I prepared for tests. They are uploaded to the Releases page: https://github.com/trufanov-nok/minidjvu_mod/releases/tag/perf_test

I think you need to start with minidjvu-distr-mingw64-multithread_0.9a1.rar. There is also a simple GUI made for Windows users. Its binaries are on the Releases page too, and the GUI sources are here: https://github.com/trufanov-nok/minidjvu_mod_gui.

maple7-7-7 commented 3 years ago

Hi Alex!

Wow . . I couldn't believe it when I saw your notification!

Thank you so much. And here in Toronto it is like 3 in the morning.

Only in the last two weeks have I given thought again to creating the smallest-sized DjVus for a given set of input pages. Now that the days are shorter and it is colder out, I am indoors more.

About a week ago I was thinking of asking Leon if he would be interested in helping me come up with a functional version (for me) of your program to work with!

So this update from you is really exciting. I am really just hoping to install a ready-to-go version of minidjvu that reduces the file size by the roughly 30% you have indicated. Hopefully, it would install as easily as the regular version. For me, though, I have kept running into installation challenges. I am not a programmer, but I do like to tweak already-installed programs a bit.

In Linux, I find that installing a deb version of a program/package works best for me, as it is nicely automated in most cases.

I have two older computers that use Lubuntu, a lightweight version of Ubuntu. The regular version of minidjvu 0.8 works well with it. I think they are 14.04 32-bit.

I have Win 10 also, on a desktop purchased new this year.

I will try the links shortly.

Thanks again so much, Stephen



maple7-7-7 commented 3 years ago

Hi, (some tests)

I spent a bit of time creating some DjVus using two 10-page sets of PBMs. One set is called "mixed pages" and has title pages and text pages; the other set is called "full pages" and has all text pages. Each PBM image has an added-in margin for a more rounded look. It also allows one to test how good the minidjvu_mod program is in a different way than characters only.

All pages are 4250x5500 pixels and were pre-processed PBMs that had an auto-gamma feature added to make the characters darker as PBMs. I ran the sets through your Win MiniDjVu_mod and then through the Any2DjVu Server. In general, the results, quality-wise, are about equal, and the MiniDjVu_mod DjVus are only about 1-3% larger than the Any2DjVus. So this is a big improvement over minidjvu. If I remember rightly, regular minidjvus were about 10% larger than Any2DjVus, but I am not sure. I do not recall the file sizes of the two programs being this close. One good exception is that when encoded at 500 dpi in MiniDjVu_mod, the file was slightly smaller than either of the two Any2DjVus using 300dpi and 600dpi web options. The 4250x5500 is really meant for 500dpi.

Speedwise, the encoding with MiniDjVu_mod took at most 3 seconds, although that is on the newish Desktop computer.

Note that I tried the 3 compression levels in MiniDjVu_mod, and I got the same DjVu file sizes for the same input.

Other notes: The Any2DjVu server does not seem to accept files over 30 MB. Each uncompressed PBM was around 2.8 MB, so I compressed them using the Ultra compression option in 7Zip. This gave excellent compression to about 1 MB for the 10-page sets.

Thanks again, Stephen



trufanov-nok commented 3 years ago

Hi, I've managed to get the sources to autobuild on Ubuntu's servers for trusty (14.04), and the packages are now available in my PPA. You can get the deb for 14.04 by adding the PPA to your system:

sudo add-apt-repository ppa:truf/minidjvu-mod-temporary
sudo apt-get update

And then sudo apt install minidjvu-mod. You should get package updates after that as usual.

Note: this is a temporary PPA. I'm going to integrate DjVu creation based on minidjvu-mod into ScanTailor Universal (STU), and will then move the binaries to ScanTailor's PPA, as it will depend on them. Today I came back to the minidjvu-mod repository to add commands for assigning Djbz dictionaries to individual pages. That's barely useful from the command line but could be interesting to play with via STU, for example to keep empty pages or pages with few characters from wasting a slot in the pages_per_dict queue.

As for

In general, the results, quality-wise, are about equal, and the MiniDjVu_mod DjVus are only about 1-3% larger than the Any2DjVus.

If it's true that

Another solution is provided by the compression server at (http://any2djvu.djvuzone.org). This machine uses pre-lizardtech prototype encoders from AT&T Labs and performs almost as well as the commercial Lizardtech encoders.

Then I'm fine with that, although I believe minidjvu_mod can achieve better results. In fact, I got much better results on larger test material, but found out that it was a "dirty" experiment because all the input image sets were TIFF files preprocessed by ScanTailor, and it seems the encoder benefits significantly from ScanTailor's smoothing, which is enabled by default. That's another reason to integrate them. And as ST is basically a natural manual image segmenter, it will allow keeping the quality of illustrations untouched, and many other interesting things in the scope of color text encoding, etc.

Speedwise, the encoding with MiniDjVu_mod took at most 3 seconds, although that is on the newish Desktop computer.

Linux binaries are supposed to outperform the Windows binaries for some reason (perhaps I'm using an old msvc compiler to keep them compatible with WinXP, or I just can't pick the right optimization flags). So there is hope that the old machine won't be many times slower.

Note that I tried the 3 compression levels in MiniDjVu_mod, and I got the same DjVu file sizes for the same input.

You mean that setting different compression levels changes nothing? That's weird. Do they affect the encoding time? If not, then I suppose something is broken and max compression is used by default.

maple7-7-7 commented 3 years ago

Hi again Alex!

Thanks for all the new information.

I have looked at ScanTailor briefly in the past. But I have not played with it.

I will get the Linux versions of minidjvu_mod soon. Thank you!

I will also look at STU, which I did not know about.

With regard to even better reductions in file size, whether it is through ScanTailor or some other approach, the main thing to me is the file size reduction itself, the final product. I think it will be really exciting to get ST to combine its file-reducing smoothing function, which I did not know about, with the minidjvu_mod encoder and maybe see the magic of file sizes considerably smaller than the commercial version(s).

This whole project reminds me of the 1973 Triple Crown winner Secretariat. In the final race of the 3 races, the Belmont Stakes, this horse won by 31 lengths, and it was the longest distance of the 3 races. Commentators were saying before the race things like, "Let's see what this horse can really do."

I think we are now ready to see what DjVu can really do, and now you are working with the dictionary pages in interesting ways, too. I also thought it would be really cool to create a dictionary that works with background chunks. For instance, you might have a repeating basic background color image, or a repeating parchment look.

With regard to the 3 different compression levels, using -C 1, -C 2, and -C 3, yes, they all gave the same number of bytes in the output DjVus with the same set of PBMs and all other parameters constant. They all processed in about the same 2-3 seconds, and all the intermediate notifications of the processing were the same.
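A minimal sketch for checking this kind of result without reading byte counts by hand. The helper name all_same_size and the output file names are hypothetical; only the -C 1, -C 2 and -C 3 levels come from the report above.

```shell
# Return 0 if every file given has the same size in bytes,
# 1 if any size differs, 2 if a file cannot be read.
all_same_size() {
  first=
  for f in "$@"; do
    [ -r "$f" ] || return 2
    size=$(( $(wc -c < "$f") ))
    if [ -z "$first" ]; then first=$size; fi
    if [ "$size" -ne "$first" ]; then return 1; fi
  done
  return 0
}

# Hypothetical usage, after encoding the same inputs once per level:
#   all_same_size out-C1.djvu out-C2.djvu out-C3.djvu && echo "levels had no effect on size"
```

Equal sizes do not prove the files are identical; cmp -s between two outputs would settle whether the encoder produced byte-for-byte the same document.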

Note that the details in the Win 10 command prompt for minidjvu_mod still say to enter minidjvu [options] [input file] (etc).

But actually I had to enter minidjvu_mod [options] [input file] (etc) to get the program to run.

Thanks again, Stephen



trufanov-nok commented 3 years ago

Hi,

I've updated the GUI (minidjvu-mod-gui_win64-v0.2.rar): the problem with incorrect classifier should be fixed now. I couldn't test a win32 ver, but win64 should work.

I also thought it would be really cool to create a dictionary that works with background chunks. For instance, you might have a repeating basic background color image, or a repeating parchment look.

The nuance here is that DjVu is a standard (I think the current revision is 22 or 25). Encoders should follow the standard and, even more, should respect the decoder implementations, especially DjVuLibre's, as most readers just use it instead of writing their own decoders. Otherwise you'll get the smallest document in the world, which no book reader app can open and render to the screen properly. So I would say it's interesting to experiment, but such experiments require a compatibility assessment against the DjVu specification and its iconic implementation. And if the enhancement doesn't fit into the standard and its implementations, then even if it is feasible, it will take years to roll out into users' practice.

But actually I had to enter minidjvu_mod

It's fixed but not built for tests yet. Note: I already renamed some binaries from "minidjvu_mod" to "minidjvu-mod" as Debian doesn't like "_" in package names. Sooner or later the whole project will be renamed.

maple7-7-7 commented 3 years ago

sudo add-apt-repository ppa:truf/minidjvu-mod-temporary

E: The repository 'http://security.ubuntu.com/ubuntu cosmic-security Release' no longer has a Release file.
N: Updating from such a repository can't be done securely, and is therefore disabled by default.

Hi Alex,

I could not get the deb file.

But the multi-threading version of minidjvu-mod that I am using in Win10 is more than good enough for the encoding.

I use a Linux machine to use the Internet, but I do most of my work on the newer Windows desktop, so there is no rush for me on the deb.

Take care, Stephen


From: Stephen Jones forsej1@outlook.com Sent: December 9, 2020 1:28 AM To: trufanov-nok/minidjvu_mod reply@reply.github.com Subject: Re: [trufanov-nok/minidjvu_mod] Can't get minidjvu_mod to work (#2)

Hi Alex,

Thank you for updating the GUI and for the details about the challenges of adding a "dictionary" to use with background chunks.

I wish to correct a significant mistake I made and apologize.

I said I was using Any2DjVu as the go-to application for my best DjVus. That is true for 300 dpi, 400 dpi, and 600 dpi-related DjVus. But Any2DjVu does not offer a 500 dpi option, which is what our test documents are using.

The main test document is 4250x5500 at 500 dpi. So I checked back to when I made the DjVu last year, and it turns out that I did not use Any2DjVu for that, as the site does not offer a >10 pg dictionary for PBMs, or a 500 dpi option. Submitting a PBM-based PDF can give a 20 pg dictionary at Any2DjVu, but it is problematic when trying to do it at 500dpi.

It turned out that for 500dpi DjVus, I was using the program DjVuToy, which can offer up to a 20pg dictionary, but this only after converting the auto-gamma'd PBMs to BMPs using IrfanView.

DjVuToy does not accept PBMs or PDFs. So it has been an interesting exercise in navigation.

The output from DjVuToy in general is about equal in file size to the output at Any2DjVu. I have an old license I purchased from LizardTech to run a DjVu Virtual Printer, so I only feel slightly guilty using DjVuToy, which I think is using LizardTech encoders, but I am not sure. The commercial Printer Version is actually very slow and sometimes gives strange results. I feel I have paid for the right to generate high-quality commercial-level Sjbz DjVus. Now that you have produced an equivalent in minidjvu-mod, I thank you that I can avoid this grey area.

The test document uses the very legible font Century Schoolbook L, one of the free fonts for Linux. It was specifically released to Linux by the printing foundry itself. Many children of the 1950s and 1960s in North America would be familiar with this font look, because the original font, Century Schoolbook, was used in many of their early reading schoolbooks.

I added a rounded-rectangle style margin to the pages by inserting a PBM image made in GIMP into the original Word document. Ideally, DjVu pages with this margin look a bit like a tablet when viewed in a DjVu viewer with a black background. Recent viewers seem to be going with grey backgrounds. The added margin means that the page file sizes are a few percent higher than without it. Adding the auto-gamma function to the PBMs also increases file size similarly.

One of the tests of your minidjvu-mod fidelity to the originals is to check that there is a 10-pixel margin along the sides. This is seen.

Why 500 dpi, and not 300, 400 or 600? (Facts and Opinions)

Making 500 dpi is more work, but

300 dpi is not crisp enough, with lots of funny-looking characters when zoomed.

400 dpi is sharper, but the periods and dots are not very symmetrical.

500 dpi is nice in that the periods and dots are symmetrical octagons. Unless zoomed, this is not noticeable and they look round. 500 dpi is the first higher resolution that looks high-quality.

600 dpi is too crisp, and the dots and periods are not as symmetrical. 600 dpi is also less legible than 500, in that it is a lighter weight font than 500 dpi. It gives significantly larger file sizes, with little gained, if anything, over the 500 dpi.

I think of 500 dpi as the sweet spot.

A number of archive.org DjVus were 500 dpi, which was initially a surprise to me.

Here are some results regarding experimenting further with your installed file minidjvu-distr-mingw64-multithread_0.9a1.rar and also comparing the results with DjVuToy results.

These results were obtained before receiving your newest email.

Other Notes and Observations first:

If sending a PBM zipped file to Any2Djvu, the dictionaries are 10pg.

If sending a PDF made from PBMs, dictionaries are 20pg.

Any2DjVu does not seem to process any files at or above 30 MB.

Workflow for Pages, using minidjvu-mod:

docx --> pdf --> split pdfs --> auto-gamma'd pbms --> DjVus

Note: I am using a multi-core desktop and I think there are at least 8 cores. There are 16 GB of RAM, of which 2 GB are VRAM. The computer and CPU are considered a high-end budget type. Not a high-powered gaming computer, but handles video editing tasks fairly quickly.

The full test document, of which I initially sent you minidjvu-mod results for 10 pages of 4250x5500 at dpi 500 and -p 10 (default), is 903 pages. Each page is input into minidjvu-mod as an auto-gamma'd PBM.

I wished to see how well and how quickly minidjvu-mod would encode the entire document, and also test various dictionary page sizes.

Because the command prompt in Win10 would not accept the number of characters needed to add in bulk all 903 PBM filenames to the screen, I split the project into a page 001-500.djvu and a 501-903.djvu.
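The split described above can be sketched as a small helper that prints page ranges for a given batch size. The function name batch_ranges and the three-digit page numbering are illustrative assumptions, not part of minidjvu-mod.

```shell
# Print inclusive page ranges that cover pages 1..total in batches of at most `size`.
batch_ranges() {
  total=$1
  size=$2
  start=1
  while [ "$start" -le "$total" ]; do
    end=$((start + size - 1))
    if [ "$end" -gt "$total" ]; then end=$total; fi
    printf '%03d-%03d\n' "$start" "$end"
    start=$((end + 1))
  done
}

batch_ranges 903 500
# prints:
# 001-500
# 501-903
```

Each printed range would then become one encoder run with its own output file. On Linux the shell can usually expand a wildcard such as *.pbm into all 903 names in one command, so the split may only be needed in the Windows command prompt.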

I tested dictionary page sizes of 10, 20, 50, 75, 100, and 150.

I really don't like going much above 50, but I set the preferred standard at 100, and just did 150 for fun. I worry, but without evidence, that larger page numbers per dictionary might slow down the speed of jumping from one section of a big DjVu document to another. I don't worry so much about how much longer the encoding itself might take, but I do like to see what that is, too.

Results in general for each encoded minidjvu-mod DjVu (400-500 pgs):

Quality of document is like that of the DjVuToy document. Excellent.

DjVuToy quality is like that of Any2DjVu.

Encoding speeds (multithreading):

10 pg dict and 20 pg dict: both about 20 sec (very fast for 903 pgs)

50 pg dict about 40 sec

100 pg dict about 1 min

150 pg dict about 1.5 - 2 min

These are estimates just counting on my own. The point is: Very Fast!
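For numbers that are not counted by hand, a sweep over the dictionary sizes could time each run. This is only a sketch under assumptions: the function name sweep, the ENCODER variable and the output naming are hypothetical, and the only option used is -p (pages per dictionary) as mentioned in this thread; any other options the encoder needs would have to be added.

```shell
# Encoder binary to call; override ENCODER if yours is named differently.
ENCODER=${ENCODER:-minidjvu-mod}

# sweep OUT_PREFIX INPUT...
# Encodes the same inputs once per dictionary size and prints wall-clock seconds per run.
sweep() {
  out_prefix=$1
  shift
  for p in 10 20 50 75 100 150; do
    start=$(date +%s)
    "$ENCODER" -p "$p" "$@" "${out_prefix}-p${p}.djvu" || return 1
    end=$(date +%s)
    printf '%s pages/dict: %s s\n' "$p" "$((end - start))"
  done
}

# Hypothetical usage:
#   sweep book page-*.pbm
```

Whole-second resolution is coarse for runs of a few seconds, but it is enough to tell 20 seconds from 2 minutes.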

I can recall much less demanding projects for minidjvu, but on slower computers, of course, taking hours.

The maximum cache or memory use was around 4-5 GB for the big dictionaries, and more like 0.5 GB for the smaller ones.

Pages per dictionary and DjVu file sizes (first 500 pgs, no OCR):

10p    1624 KB
20p    1534 KB
50p    1459 KB
75p    1435 KB
100p   1423 KB
150p   1415 KB
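To put these sizes in proportion, the savings relative to the 10-page dictionary can be worked out with awk. The helper name percent_saved is hypothetical; the numbers are the ones reported above.

```shell
# Print how much smaller `size` is than `base`, as a percentage with one decimal.
percent_saved() {
  awk -v base="$1" -v size="$2" 'BEGIN { printf "%.1f\n", (base - size) * 100 / base }'
}

# pages-per-dictionary:size-in-KB pairs from the results above
for entry in 20:1534 50:1459 75:1435 100:1423 150:1415; do
  pages=${entry%%:*}
  size=${entry##*:}
  printf '%sp: %s%% smaller than 10p\n' "$pages" "$(percent_saved 1624 "$size")"
done
# prints:
# 20p: 5.5% smaller than 10p
# 50p: 10.2% smaller than 10p
# 75p: 11.6% smaller than 10p
# 100p: 12.4% smaller than 10p
# 150p: 12.9% smaller than 10p
```

Most of the gain arrives by 50 pages per dictionary; going from 100 to 150 pages saves only another 8 KB.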

The 3 attached DjVus:

001 MiniDjVu-mod DjVu 500 pgs 10 pg dictionary no OCR
002 MiniDjVu-mod DjVu 500 pgs 100 pg dictionary no OCR
003 DjVuToy DjVu 903 pgs with FGbzs, BG44s and simplified OCR 20 pg dictionary with chunks added using DjVuLibre

Simplified OCR leaves out the "Line" lines in the added dsed.

I actually don't really like the BG, but just experimenting.

To me, the pages in B & W mode look just like your minidjvu-mod ones.

Calculations indicate that the larger dictionaries and being able to set the dpi at 500 to fit the pixel array make your DjVus a few percentage points smaller than the LizardTech ones!

Our example: See DjVu 003 (from DjVuToy + DjVuLibre to add bgs, etc) DjVuToy Sjbzs + 20pg dicts + DjVuLibre FG/BG chunks + OCR = 4.77 MB.

All parameters equal except dictionary size and encoder source: Your version is calculated to be 4.66 MB! About 2.5% smaller!

According to the DjVuLibre literature, the C44 functions as well as the commercial C44. So, in my view, you have now created a DjVu program that allows anyone with some extra knowledge to make commercial-quality DjVus!

Great Work! A Milestone!

Thank you so much.

Stephen



trufanov-nok commented 3 years ago

Hi, Regarding

E: The repository 'http://security.ubuntu.com/ubuntu cosmic-security Release' no longer has a Release file.
N: Updating from such a repository can't be done securely, and is therefore disabled by default.

Just ignore it and execute sudo apt install minidjvu-mod.
This error isn't about my PPA. When you ran add-apt-repository, it added my PPA URL to the list of sources (stored in /etc/apt/sources.list and the files under /etc/apt/sources.list.d/) and then automatically started apt update to refresh the database of packages available from these URLs. One of the URLs in this list was http://security.ubuntu.com/ubuntu cosmic-security Release, which isn't valid for your system anymore for some reason. So you got this error, but if that's all, then all the other URLs were processed fine and your apt now knows about all the packages available via my URL. So you can just try to install it with sudo apt install minidjvu-mod.

You will see this error every time you manually launch sudo apt update unless you comment out this URL in /etc/apt/sources.list, or execute sudo add-apt-repository --remove "http://security.ubuntu.com/ubuntu cosmic-security Release", or delete it via some GUI such as the settings of Muon, which is available on KDE-based systems.
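Commenting out the entry by hand could look roughly like the sketch below. The function name comment_out_source is hypothetical, and the exact text of the offending line on a given system is an assumption: check the file first and adjust the pattern. A backup copy with the suffix .bak is written next to the file.

```shell
# comment_out_source PATTERN FILE
# Prefix every not-yet-commented line of FILE that contains PATTERN with "# ".
# PATTERN is used as an extended regular expression and must not contain "|".
comment_out_source() {
  pattern=$1
  file=$2
  sed -i.bak -E "s|^([^#].*${pattern}.*)\$|# \\1|" "$file"
}

# Hypothetical usage, from a root shell:
#   comment_out_source 'cosmic-security' /etc/apt/sources.list
```

Running sudo apt update afterwards should show whether the error is gone.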

maple7-7-7 commented 3 years ago

Hi Alex,

Thanks for your detailed help.

I tried your

sudo add-apt-repository --remove "http://security.ubuntu.com/ubuntu cosmic-security Release"

and it went through, but everything else, when re-tried, has stayed the same.

It should be noted that I also can no longer use the regular Lubuntu minidjvu program. This part actually happened after I got error messages when trying to install minidjvu_mod a few months ago.

I get this when trying to re-access minidjvu:

minidjvu: symbol lookup error: minidjvu: undefined symbol: mdjvu_set_classify_options

I guess I am just uncomfortable now with using my Lubuntu in its current state for installing minidjvu-related programs and would rather do a fresh install of Lubuntu before trying the new stuff. But I also use the same computer to access the Internet, so I have to plan this out a bit.

I could maybe try to make a dual-boot machine with an older WinXP laptop I have.

But again I appreciate all the details and will keep them for possible future use. I don't want to keep asking you about things when now it could be totally my end that needs some work.

Thoughts? Then done for now? Stephen



trufanov-nok commented 3 years ago

Well, first of all you should remove the minidjvu-mod files that might be installed on your system in /usr/local/bin/ or /usr/local/lib/. By default, apps compiled by the user are installed in /usr/local/*. Apps that come from a repository and are managed by a package manager are usually installed in /usr/bin/ or /usr/lib/. If you had minidjvu from the repository in /usr/bin/ and some pieces of a locally compiled minidjvu-mod that might not have been properly renamed, they can overlap and conflict. So just:

sudo rm /usr/local/bin/minidjvu*
sudo rm /usr/local/lib/x86_64-linux-gnu/libminidjvu*
sudo rm /usr/bin/minidjvu*
sudo rm /usr/lib/x86_64-linux-gnu/libminidjvu*

Then reinstall minidjvu and its library from repos:

sudo apt remove minidjvu libminidjvu0
sudo apt install minidjvu

During installation, the minidjvu package will pull in its library package automatically. At this point you should have the original minidjvu installed, which you can check by running minidjvu. It will print its help text.

After that, run sudo apt install minidjvu-mod and send me the output. It should either install the package or say that you already have it. You can check that the app is installed by running minidjvu-mod. Check that both minidjvu and minidjvu-mod can be launched and do not conflict.
Note: you can also use the which command to get the path where each application is stored. For example:

$ which minidjvu
/usr/bin/minidjvu
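
Since the concern here is a locally built copy under /usr/local shadowing the packaged one, it can help to list every match on PATH rather than only the first one that which reports. A minimal sketch, assuming a POSIX shell; find_all_on_path is a hypothetical helper written for illustration, not part of minidjvu:

```shell
# List every executable named NAME on PATH, in lookup order.
# The body runs in a subshell so the IFS and globbing changes do not leak.
find_all_on_path() (
    name="$1"
    IFS=:
    set -f                      # do not glob-expand PATH entries
    for dir in $PATH; do
        [ -n "$dir" ] || dir=.  # an empty PATH entry means the current directory
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
)

# Example: find_all_on_path minidjvu
```

If this prints more than one path, the first line is the copy that actually runs.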

If apt says that there is no minidjvu-mod package for your system, send me the output of cat /etc/issue.net. It will print your system version. Since you had cosmic-security in your broken URL, I doubt you are on a 14.04 system: the cosmic series is 18.10, while 14.04 would be trusty.
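
The release-number-to-codename relationship used above can be written down as a small lookup. This is only a sketch covering the releases that come up in this thread; ubuntu_codename is a hypothetical helper name:

```shell
# Map an Ubuntu release number to its series codename.
# Only the releases mentioned in this thread are covered; anything else prints "unknown".
ubuntu_codename() {
    case "$1" in
        14.04*) echo trusty ;;
        18.04*) echo bionic ;;
        18.10*) echo cosmic ;;
        20.04*) echo focal ;;
        *)      echo unknown ;;
    esac
}

# Example (not run here): ubuntu_codename "$(lsb_release -rs)"
```

On a live system, lsb_release -sc prints the codename directly; the lookup is mainly useful for reading codenames that appear in URLs and package file names.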

maple7-7-7 commented 3 years ago

Okay, here goes . . .

sudo rm /usr/local/bin/minidjvu*
sudo rm /usr/local/lib/x86_64-linux-gnu/libminidjvu*
sudo rm /usr/bin/minidjvu*
sudo rm /usr/lib/x86_64-linux-gnu/libminidjvu*

Lines 1 and 3 were good. Lines 2 and 4 - No such file or directory

Successfully installed minidjvu and it runs.

Tried to install minidjvu-mod

Unable to locate package mindjvu-mod

You are right, and I am wrong again . . .

Ubuntu 18.10, although I use the Lubuntu version. Sorry.



maple7-7-7 commented 3 years ago

Hi again,

I located the old laptop with Lubuntu - Ubuntu 14.04.2. LTS

Still says can't find the program minidjvu-mod

This is before and after the update function.

I re-installed minidjvu. It works and is 0.8.

Stephen



trufanov-nok commented 3 years ago

Well, I'm new to PPAs, and it looks like I was completely wrong to expect that Ubuntu's Launchpad would let me distribute packages to any system I want. It seems to be designed for the few newest releases and a few old LTS distributions. If I want anything beyond that, I need to create my own repository and host it, e.g. on GitHub Pages. Perhaps I'll manage to set that up, but right now I would rather concentrate on app development.
So I've decided to give up on distributing packages through an online repository for now and fall back to the old way: just download and install the deb files, or build from source.

Well, let's try to install minidjvu-mod in this old way. Just download deb packages.
For 14.04 32 bit try:

cd /tmp
wget https://launchpad.net/~truf/+archive/ubuntu/minidjvu-mod-temporary/+files/libminidjvu-mod0_0.9m01_i386.deb
wget https://launchpad.net/~truf/+archive/ubuntu/minidjvu-mod-temporary/+files/minidjvu-mod_0.9m01_i386.deb

sudo dpkg -i minidjvu-mod_0.9m01_i386.deb libminidjvu-mod0_0.9m01_i386.deb

For 14.04 64 bit try:

cd /tmp
wget https://launchpad.net/~truf/+archive/ubuntu/minidjvu-mod-temporary/+files/libminidjvu-mod0_0.9m01_amd64.deb
wget https://launchpad.net/~truf/+archive/ubuntu/minidjvu-mod-temporary/+files/minidjvu-mod_0.9m01_amd64.deb

sudo dpkg -i minidjvu-mod_0.9m01_amd64.deb libminidjvu-mod0_0.9m01_amd64.deb

For 18.10 64bit try:

cd /tmp
wget https://launchpad.net/~truf/+archive/ubuntu/minidjvu-mod-temporary/+files/libminidjvu-mod0_0.9m01focal_amd64.deb
wget https://launchpad.net/~truf/+archive/ubuntu/minidjvu-mod-temporary/+files/minidjvu-mod_0.9m01focal_amd64.deb

sudo dpkg -i libminidjvu-mod0_0.9m01focal_amd64.deb minidjvu-mod_0.9m01focal_amd64.deb
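
The three variants above differ only in the version suffix and the architecture, so the URLs can be assembled rather than copied by hand. A rough sketch: deb_url and PPA_FILES are names made up for illustration, and not every suffix and architecture pair is necessarily published, so wget's exit status should be checked:

```shell
# Base URL of the temporary PPA file area used in this thread.
PPA_FILES='https://launchpad.net/~truf/+archive/ubuntu/minidjvu-mod-temporary/+files'

# Print the download URL for PACKAGE at VERSION built for ARCH.
# VERSION carries the per-series suffix, e.g. 0.9m01 or 0.9m01focal.
deb_url() {
    printf '%s/%s_%s_%s.deb\n' "$PPA_FILES" "$1" "$2" "$3"
}

# Example (not run here):
#   arch=$(dpkg --print-architecture)    # i386 or amd64 on the machines discussed
#   for pkg in libminidjvu-mod0 minidjvu-mod; do
#       wget "$(deb_url "$pkg" 0.9m01 "$arch")" || echo "not published: $pkg for $arch"
#   done
```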

If it's Ok then you can try to install gui:

#14.04 32bit:
wget https://launchpad.net/~truf/+archive/ubuntu/minidjvu-mod-temporary/+files/minidjvu-mod-gui_0.2-0ubuntu1_i386.deb
sudo dpkg -i minidjvu-mod-gui_0.2-0ubuntu1_i386.deb

#14.04 64bit:
wget https://launchpad.net/~truf/+archive/ubuntu/minidjvu-mod-temporary/+files/minidjvu-mod-gui_0.2-0ubuntu1_amd64.deb
sudo dpkg -i minidjvu-mod-gui_0.2-0ubuntu1_amd64.deb

#18.10 64bit (I'm not sure whether this one will install; if not, let me know):
wget https://launchpad.net/~truf/+archive/ubuntu/minidjvu-mod-temporary/+files/minidjvu-mod-gui_0.2-0ubuntu1_amd64.deb
sudo dpkg -i minidjvu-mod-gui_0.2-0ubuntu1_amd64.deb

maple7-7-7 commented 3 years ago

Thanks Alex!

My 14.04 laptop is 32-bit, so I can try to run your 32-bit programs.

My 18.10 laptop is also 32-bit, so I would also need to have both 32-bit programs in 18.10 as well.

Thanks so much, Stephen



trufanov-nok commented 3 years ago

My 18.10 laptop is also 32-bit, so I would also need to have both 32-bit programs in 18.10 as well.

For 18.10 32bit should be:

cd /tmp
wget https://launchpad.net/~truf/+archive/ubuntu/minidjvu-mod-temporary/+files/libminidjvu-mod0_0.9m01focal_i386.deb
wget https://launchpad.net/~truf/+archive/ubuntu/minidjvu-mod-temporary/+files/minidjvu-mod_0.9m01focal_i386.deb

sudo dpkg -i libminidjvu-mod0_0.9m01focal_i386.deb minidjvu-mod_0.9m01focal_i386.deb

wget https://launchpad.net/~truf/+archive/ubuntu/minidjvu-mod-temporary/+files/minidjvu-mod-gui_0.2-0ubuntu1_i386.deb
sudo dpkg -i minidjvu-mod-gui_0.2-0ubuntu1_i386.deb

maple7-7-7 commented 3 years ago

Thanks again!



maple7-7-7 commented 3 years ago

Working with 18.10 32-bit

dpkg: error processing package minidjvu-mod (--install):
 dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of libminidjvu-mod0:
 libminidjvu-mod0 depends on libjemalloc1 (>= 2.1.1); however:
  Package libjemalloc1 is not installed.

I checked and libjemalloc2 is installed. The 18.10 32-bit laptop does not support multi-threading.



trufanov-nok commented 3 years ago

That's because you're trying to install minidjvu-mod with the library from 14.04 instead of 18.10, and that happened because you couldn't get the proper library. I've just realized that https://launchpad.net/~truf/+archive/ubuntu/minidjvu-mod-temporary/+files/minidjvu-mod_0.9m01focal_i386.deb doesn't exist on the server, so wget returns error 404. I built that package for 20.04, and it looks like 32-bit support was dropped in that release. I need to resubmit the package for 18.04 to make Launchpad build it for both 32-bit and 64-bit. It will take a few hours. I'll send the proper URLs then.
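
A mismatch like this can be spotted before installing by reading the Depends field of the downloaded .deb. The sketch below only splits such a field into package names; dep_names is a hypothetical helper, and the dpkg-deb and dpkg calls in the comment are one rough way to use it, not a full dependency resolver:

```shell
# Split a Debian "Depends" field into one package name per line,
# dropping version constraints such as "(>= 2.1.1)". For alternatives
# written as "a | b" only the first alternative is kept.
dep_names() {
    printf '%s\n' "$1" | tr ',' '\n' |
        sed -e 's/|.*//' -e 's/(.*//' -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' |
        grep -v '^$'
}

# Example (not run here):
#   dep_names "$(dpkg-deb -f libminidjvu-mod0_0.9m01_i386.deb Depends)" |
#       while read -r pkg; do
#           dpkg -s "$pkg" >/dev/null 2>&1 || echo "missing: $pkg"
#       done
```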

maple7-7-7 commented 3 years ago

Ok.

Remember of course that I already have a great Win64 multi-threading minidjvu_mod functioning on my newest computer, so there is no rush from me on this.

But of course it will also be interesting for both of us to see how things run in Linux on older computers.

Would you believe I also got the 404 message, but decided to leave out the term "focal", and it installed (sort of). I have no clue what focal means, in fact I really don't know what you are doing most of the time as I am ignorant of Linux commands, but the term was sometimes in the command line and sometimes not, so I left it out for "consistency." (!) (?)

Have some fun with this:

stephen@stephen-pc:/tmp$ wget https://launchpad.net/~truf/+archive/ubuntu/minidjvu-mod-temporary/+files/libminidjvu-mod0_0.9m01focal_i386.deb
--2020-12-10 07:02:49-- https://launchpad.net/~truf/+archive/ubuntu/minidjvu-mod-temporary/+files/libminidjvu-mod0_0.9m01focal_i386.deb
Resolving launchpad.net (launchpad.net)... 91.189.89.223, 91.189.89.222, 2001:67c:1560:8003::8004, ...
Connecting to launchpad.net (launchpad.net)|91.189.89.223|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2020-12-10 07:02:49 ERROR 404: Not Found.

Next: Deleting "focal" "because it was there!" - Ridiculous idea.

stephen@stephen-pc:/tmp$ wget https://launchpad.net/~truf/+archive/ubuntu/minidjvu-mod-temporary/+files/libminidjvu-mod0_0.9m01_i386.deb
--2020-12-10 07:13:30-- https://launchpad.net/~truf/+archive/ubuntu/minidjvu-mod-temporary/+files/libminidjvu-mod0_0.9m01_i386.deb
Resolving launchpad.net (launchpad.net)... 91.189.89.223, 91.189.89.222, 2001:67c:1560:8003::8004, ...
Connecting to launchpad.net (launchpad.net)|91.189.89.223|:443... connected.
HTTP request sent, awaiting response... 303 See Other
Location: https://launchpadlibrarian.net/510105683/libminidjvu-mod0_0.9m01_i386.deb [following]
--2020-12-10 07:13:30-- https://launchpadlibrarian.net/510105683/libminidjvu-mod0_0.9m01_i386.deb
Resolving launchpadlibrarian.net (launchpadlibrarian.net)... 91.189.89.229, 91.189.89.228, 2001:67c:1560:8003::8008, ...
Connecting to launchpadlibrarian.net (launchpadlibrarian.net)|91.189.89.229|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 57846 (56K) [application/x-debian-package]
Saving to: ‘libminidjvu-mod0_0.9m01_i386.deb’

libminidjvu-mod0_0.9m01_i386. 100%[================================================>] 56.49K 296KB/s in 0.2s

2020-12-10 07:13:31 (296 KB/s) - ‘libminidjvu-mod0_0.9m01_i386.deb’ saved [57846/57846]

stephen@stephen-pc:/tmp$ wget https://launchpad.net/~truf/+archive/ubuntu/minidjvu-mod-temporary/+files/minidjvu-mod_0.9m01_i386.deb
--2020-12-10 07:17:44-- https://launchpad.net/~truf/+archive/ubuntu/minidjvu-mod-temporary/+files/minidjvu-mod_0.9m01_i386.deb
Resolving launchpad.net (launchpad.net)... 91.189.89.222, 91.189.89.223, 2001:67c:1560:8003::8003, ...
Connecting to launchpad.net (launchpad.net)|91.189.89.222|:443... connected.
HTTP request sent, awaiting response... 303 See Other
Location: https://launchpadlibrarian.net/510105682/minidjvu-mod_0.9m01_i386.deb [following]
--2020-12-10 07:17:45-- https://launchpadlibrarian.net/510105682/minidjvu-mod_0.9m01_i386.deb
Resolving launchpadlibrarian.net (launchpadlibrarian.net)... 91.189.89.228, 91.189.89.229, 2001:67c:1560:8003::8007, ...
Connecting to launchpadlibrarian.net (launchpadlibrarian.net)|91.189.89.228|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 23806 (23K) [application/x-debian-package]
Saving to: ‘minidjvu-mod_0.9m01_i386.deb’

minidjvu-mod_0.9m01_i386.deb 100%[================================================>] 23.25K --.-KB/s in 0.09s

2020-12-10 07:17:45 (262 KB/s) - ‘minidjvu-mod_0.9m01_i386.deb’ saved [23806/23806]

stephen@stephen-pc:/tmp$ sudo dpkg -i minidjvu-mod_0.9m01_i386.deb libminidjvu-mod0_0.9m01_i386.deb
[sudo] password for stephen:
Selecting previously unselected package minidjvu-mod.
(Reading database ... 341722 files and directories currently installed.)
Preparing to unpack minidjvu-mod_0.9m01_i386.deb ...
Unpacking minidjvu-mod (0.9m01) ...
Selecting previously unselected package libminidjvu-mod0.
Preparing to unpack libminidjvu-mod0_0.9m01_i386.deb ...
Unpacking libminidjvu-mod0 (0.9m01) ...
dpkg: dependency problems prevent configuration of minidjvu-mod:
 minidjvu-mod depends on libjemalloc1 (>= 2.1.1); however:
  Package libjemalloc1 is not installed.

dpkg: error processing package minidjvu-mod (--install):
 dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of libminidjvu-mod0:
 libminidjvu-mod0 depends on libjemalloc1 (>= 2.1.1); however:
  Package libjemalloc1 is not installed.

dpkg: error processing package libminidjvu-mod0 (--install):
 dependency problems - leaving unconfigured
Processing triggers for man-db (2.8.4-2) ...
Errors were encountered while processing:
 minidjvu-mod
 libminidjvu-mod0
stephen@stephen-pc:/tmp$

A work in progress.



trufanov-nok commented 3 years ago

Hm. It looks like I can't build with libjemalloc2 for 32-bit on Launchpad. 18.04 provides only jemalloc1. 19.04 isn't available for building. 19.10 dropped 32-bit support. Only whitelisted packages can be built for 32-bit on 20.04.

But it looks like there is a 32-bit jemalloc1 for cosmic (18.10) in the compatibility archives. And most importantly, it looks like jemalloc1 and jemalloc2 can be installed on the same machine without any conflicts between them. So my suggestion is to just install jemalloc1 on your machine from the archives.

Try these:

cd /tmp
wget http://launchpadlibrarian.net/344123790/libjemalloc1_3.6.0-11_i386.deb
wget https://launchpad.net/~truf/+archive/ubuntu/minidjvu-mod-temporary/+files/libminidjvu-mod0_0.9m01gbionic_i386.deb
wget https://launchpad.net/~truf/+archive/ubuntu/minidjvu-mod-temporary/+files/minidjvu-mod_0.9m01gbionic_i386.deb

sudo dpkg -i libjemalloc1_3.6.0-11_i386.deb libminidjvu-mod0_0.9m01gbionic_i386.deb minidjvu-mod_0.9m01gbionic_i386.deb

wget https://launchpad.net/~truf/+archive/ubuntu/minidjvu-mod-temporary/+files/minidjvu-mod-gui_0.2gbionic_i386.deb

sudo dpkg -i minidjvu-mod-gui_0.2gbionic_i386.deb

maple7-7-7 commented 3 years ago

Hi Alex,

Great news! I was able to install your recommendations and so now we have a functioning minidjvu-mod program in Linux (Lubuntu 18.10 - 32 bit). Thanks so much! I really don't work much with Linux commands or know how to use them without help, except when I play with DjVuLibre.

The Linux laptop is a Toshiba Satellite with 3 GB RAM, a T2050 dual-core processor with multi-threading enabled in the BIOS. It is dual-boot -- Lubuntu and Win7.

The Win10 desktop is 16 GB RAM (2 GB VRAM), and maybe 8 cores.

Basic Results:

I ran tests using the same first set of 500 PBMs used for the Win64 multi-threading test document and compared the results.

I tested the following pages-per-dictionary settings, in order: 10pg, 20pg, 50pg, 100pg.

Results: The 10pg, 20pg, and 50pg Linux DjVu versions all came out as the same size as the Windows DjVu versions and look great. The margins are accurate, and the characters look fine. Thanks!

Processing times were much slower, being a much less powerful computer, and it did not seem to be dual-threading, nor was I necessarily expecting that capability from the program at this point. I thought at first it was not a dual-core machine.

The processing times were from about 4:30 min increasing to about 5:30 min, which isn't bad at all, going from a 10pg dict to a 50pg dict and 500 pages at 500dpi.

I think I would have waited hours in the past with the regular minidjvu if I ever tried a 50pg dictionary option. I was very surprised how little difference in processing time there was between the 10pg dict and the 50pg dict options.

At 100pg, I got an error message, and noticed the required allocation cache memory was about 1,930 MB! And that is for the first 100 pages, which have fewer characters than subsequent 100-page segments.

I just read up a bit on how Linux manages memory, and was wondering if you think my laptop could make a 100pg dictionary DjVu if some memory parameters were adjusted, including a swap file. Speed is not of the essence, but having even a slowly made 100pg dict DjVu would be something on the dual-boot Linux laptop. I may try it later in Win 7 32-bit.

The baseline Lubuntu RAM use seems to be around 225 MB.

The Linux dual-boot laptop has a 250 GB HD, with 45 GB Linux.

I did only have 5 GB left on it, but in my experience, when watching videos, the machine only starts underperforming as it approaches 1 GB to 1.5 GB.

I backed off from the 100pg dict to a 75pg dict input, and got an error message, with about 1100 MB as the limit.

Then I tried a 70pg dict and also got an error message with about a 900 MB limit.

I next tried a 60pg dict and it worked. The highest value for the allocation of cache memory at 60pg dict was about 825 MB.

Here are some data regarding what I think are baseline, or close to baseline, memory data. (using command "free")

Total: 3023048; Used 225524; Free 1889372 Shared: 46068; buff/cache 908152; available: 2463932
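
For reference, plain free reports sizes in KiB, so the figures above can be converted to MiB with integer division. A minimal sketch; kib_to_mib is a name made up for illustration:

```shell
# Convert a KiB figure, as printed by plain `free`, to whole MiB (rounded down).
kib_to_mib() {
    echo $(( $1 / 1024 ))
}

# Example: kib_to_mib 3023048
```

With the numbers quoted here, the total comes to 2952 MiB and the available figure to 2406 MiB. Running free -m prints MiB directly.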

If I recall correctly, I think I upgraded the RAM in the laptop from 2 GB to 4 GB, but found out that for some reason it would function as if 3 GB.

The following maximum memory amounts worked shortly before a 65pg dict attempt failed: 65 images 46 MB; Classifier 359 MB; Cache 822 MB.

Anyway, it is great that you have made the program successfully work in both Windows and Linux. Thank you, too, for exact command lines to enter.

Thoughts?

Take care, Stephen



trufanov-nok commented 3 years ago

Could you check how many cores you have with grep -c ^processor /proc/cpuinfo ? I expected minidjvu-mod to use two cores by default on a dual-core machine when no -t parameter is passed. You can also pass "-t 2" to force it to use two cores with a small dictionary (20p) and check whether the app's output is the same as the default run with the same dictionary. If it is, then it was already using two cores, and you can force it to use a single core with "-t 1". That roughly halves RAM use at the cost of processing time, so you can try a bigger dictionary. If it was already single-threaded, then you don't have many options short of decreasing pages_per_dictionary or the quality of the dictionary.
Maybe the only one is to change the classifier from the default "-C 3" to "-C 1", but I don't remember how much memory it needs. Probably the same amount. It's been more than a year since I last looked into the minidjvu-mod code. Classifier -C 1 is pretty much the same one the original minidjvu used, but without some limitations. It will produce a smaller djbz than the original, but not close to what commercial encoders achieve.
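
The core check and the choice between one and two threads can be sketched as below. count_processors and pick_threads are hypothetical helpers; the only minidjvu-mod detail relied on is the -t parameter described above, and the exact memory effect of a lower thread count is not verified here:

```shell
# Count logical processors listed in a cpuinfo-style file (default /proc/cpuinfo).
count_processors() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# Choose a thread count: 1 when memory is the constraint,
# otherwise one thread per processor.
pick_threads() {
    cores="$1"; low_memory="$2"
    if [ "$low_memory" = yes ] || [ "$cores" -lt 2 ]; then
        echo 1
    else
        echo "$cores"
    fi
}

# Example (not run here):
#   threads=$(pick_threads "$(count_processors)" yes)
#   then pass "$threads" as the value of -t
```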

Frankly, I usually use 20 pages per dictionary by default. I'm testing on a 4-core 64-bit machine with 8 GB RAM, and allocating even 3 GB of RAM already feels uncomfortable. My raw scans are 400 dpi for text and 600 dpi if a page has any image on it. Before encoding, all scans are scaled to 600 dpi.

trufanov-nok commented 3 years ago

Regarding your 3Gb RAM. Show me the output of sudo dmidecode -t memory. You may use https://pastebin.com/ with autoexpiration to share big chunks of text

maple7-7-7 commented 3 years ago

Hi,

grep function says 2 cores.

(Win7 says processor 1.60 Ghz 1.60 Ghz.)

Thanks for the pastebin possibility. The following readout isn't too long.

Here is the readout for RAM . . . looks like 2 + 1 (as 1 + 2) ?

sudo dmidecode -t memory
[sudo] password for stephen:

# dmidecode 3.1
Getting SMBIOS data from sysfs.
SMBIOS 2.4 present.

Handle 0x0012, DMI type 16, 15 bytes
Physical Memory Array
    Location: System Board Or Motherboard
    Use: System Memory
    Error Correction Type: None
    Maximum Capacity: 2 GB
    Error Information Handle: Not Provided
    Number Of Devices: 2

Handle 0x0013, DMI type 17, 27 bytes
Memory Device
    Array Handle: 0x0012
    Error Information Handle: No Error
    Total Width: 32 bits
    Data Width: 32 bits
    Size: 1024 MB
    Form Factor: SODIMM
    Set: 1
    Locator: M1
    Bank Locator: Bank 0
    Type: DDR
    Type Detail: Synchronous
    Speed: 533 MT/s
    Manufacturer: Not Specified
    Serial Number: Not Specified
    Asset Tag: Not Specified
    Part Number: Not Specified

Handle 0x0014, DMI type 17, 27 bytes
Memory Device
    Array Handle: 0x0012
    Error Information Handle: No Error
    Total Width: 32 bits
    Data Width: 32 bits
    Size: 2048 MB
    Form Factor: SODIMM
    Set: 1
    Locator: M2
    Bank Locator: Bank 1
    Type: DDR
    Type Detail: Synchronous
    Speed: Unknown
    Manufacturer: Not Specified
    Serial Number: Not Specified
    Asset Tag: Not Specified
    Part Number: Not Specified

Should help.
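
The per-module sizes in a dmidecode dump can be totalled mechanically. A small sketch, assuming the usual one-field-per-line dmidecode layout; total_dimm_mb is a hypothetical helper:

```shell
# Sum the "Size:" lines of `dmidecode -t memory` output read from stdin.
# Sizes in MB or GB are handled; empty slots ("No Module Installed") are skipped.
total_dimm_mb() {
    awk '$1 == "Size:" && $2 ~ /^[0-9]+$/ {
             if ($3 == "GB") total += $2 * 1024
             else if ($3 == "MB") total += $2
         }
         END { print total + 0 }'
}

# Example (not run here): sudo dmidecode -t memory | total_dimm_mb
```

For the two modules listed here (1024 MB and 2048 MB) the total is 3072 MB.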



trufanov-nok commented 3 years ago

It looks that way, but it can be double-checked. I would remove one of the memory banks, launch Lubuntu, and check the memory size. It should be 1 or 2 GB. Then swap that bank for the other one and do the same. If you get 1 and then 2 (or 2 and then 1), it is most probably 2 + 1 GB of RAM, and you may have a chance to replace the 1 GB bank with a 2 GB one; they should be cheap nowadays. Just make sure your laptop's motherboard specification supports 2 + 2. But if both banks of RAM report 2 GB, then we have some options...
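One way to "check the memory size" after each swap is to read MemTotal from /proc/meminfo. The sketch below is only an illustration and not part of minidjvu-mod; the helper name parse_mem_total_mib is made up here. Note that the reported total is usually a little lower than the installed amount, because the kernel and firmware reserve some memory.

```python
import os


def parse_mem_total_mib(meminfo_text):
    """Return MemTotal from /proc/meminfo-style text in MiB, or None if absent."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # The line looks like: "MemTotal:        3050212 kB"
            kib = int(line.split()[1])
            return kib // 1024
    return None


# Only read the real file when run directly on a system that has it (Linux).
if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        print("MemTotal:", parse_mem_total_mib(f.read()), "MiB")
```

Running it once per installed bank should give roughly 1000 or 2000 MiB, which is enough to tell the two cases apart.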

maple7-7-7 commented 3 years ago

A quick follow-up.

I will try to remove one bank later and do your tests. I am pretty sure I started with 2 GB and added to that. I think if I had started with a 1 and then a 2, I would have remembered "the day I tripled the memory." The laptop looks to be from an era when a 2GB rather than a 1 GB would be the standard, but that is not exactly scientific. Win7 2009.

Testing the -t 1 option on the 500 pages using the 20pg dictionary took twice as long as previously without using -t 1, about 9 min vs 4.5 min, so obviously using two cores originally. The DjVus are exactly the same number of bytes: 1569918.



maple7-7-7 commented 3 years ago

Hi,

Okay, so here are all the details.

First, using single-threading rather than dual-threading allowed me to make a 75pg dict DjVu. Recall that I could not make a 75 or a 70pg dict DjVu with the first set of dual-threading DjVus. I also earlier tried making a 65pg dict DjVu but got an error message. (not documented)

I tried making the dual version again of a 75pg dict DjVu this time after rebooting to minimize unknown extra RAM use, but it still gave an error message when allocated cache memory said 1181 MB. The single-thread version had a maximum cache memory value of 1123 MB.

The fan ran fairly quietly for single-threading, but much louder for dual-threading. Recall that for dual-threading, the cache memory above 830 MB was associated with encoding error messages. So maybe single-threading is handling RAM "distribution" better than dual-threading is. (?) Or just that dual-threading is going to want more RAM in general, or both possibilities. (?)

I could not make the 100pg dict DjVu with single-threading, with the program trying to use about 1600 MB for cache memory during the first 100 pages. The dual version for a 100pg dict wanted about 1900 MB.

I took your advice to check out the two RAM memory bank chips. I figured I would just look at the labels first. If one was 2 GB and the other was 1 GB, that would be good, but if both were saying 2 GB, then we would need further research. It turns out that one is 2 GB and the other is 1 GB.

I checked the laptop's model and serial number, and Toshiba says the RAM capacity is 4 GB, which is good in that I might buy a new 2 GB and maybe get the ultimate 100pg dictionary DjVu to work on the laptop. It is not crucial, though, as we already have great results using the Win64 multi-threading version on Win10. And you are right that the RAM chips are not expensive.

One thing I don't know is what role the swap file function could play in this, especially if I go 4 GB and making the 100pg dict DjVus is a borderline situation. I wonder how adjustable things are that way.

Thoughts? And thanks again, Stephen



trufanov-nok commented 3 years ago

Hi,

So maybe single-threading is handling RAM "distribution" better than dual-threading is. (?) Or just that dual-threading is going to want more RAM in general, or both possibilities. (?)

I suspect you are just misinterpreting the error messages. The cache is allocated per thread. That's why you're able to encode with a bigger djbz in single-thread mode: two threads would require twice as much RAM to encode 2 djbz at once. There should be 2 messages about cache allocation in minidjvu-mod's output in a successful 2-thread run. And when a 2-threaded process complains about a lack of RAM, it's important which of the threads does this. Either thread no. 1 can't allocate N MB of RAM, or thread no. 1 has already allocated its memory and it's thread no. 2 that throws the error message. So it may mean that the 2-threaded run can't allocate N MB, or that it can't allocate 2*N MB, depending on the console output.
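The per-thread arithmetic can be sketched as follows. This is a rough illustration only, assuming every thread holds one cache of the same size at the same moment; the function names and the 2000 MB "available" figure are hypothetical, while 1123 MB is the single-thread value reported earlier in this thread for a 75-page dictionary.

```python
def peak_cache_mb(cache_per_dict_mb, threads):
    """Rough peak cache demand if each thread holds one cache at once."""
    return cache_per_dict_mb * threads


def fits(cache_per_dict_mb, threads, available_mb):
    """True if the estimated peak demand stays within available_mb."""
    return peak_cache_mb(cache_per_dict_mb, threads) <= available_mb


single_thread_cache_mb = 1123  # reported for a 75-page dictionary with -t 1
available_mb = 2000            # hypothetical amount the system can hand out

for threads in (1, 2):
    need = peak_cache_mb(single_thread_cache_mb, threads)
    print(threads, "thread(s):", need, "MB, fits:",
          fits(single_thread_cache_mb, threads, available_mb))
```

Under these assumptions the single-thread run fits and the 2-thread run does not, which matches the observation that -t 1 allows a bigger dictionary.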

I don't know is what role the swap file function could play in this

Swap won't help here at all. It doesn't allow an application to allocate more RAM. It allows a system with 2 cores and 1 GB of RAM to launch 3 applications that require 400 MB of RAM each, provided that only 2 of them are really running while the 3rd one is paused and dumped to the swap file on disk. Then it can be loaded back into RAM and another app goes to the swap. If the system can do this really fast and the processes aren't performing heavy calculations (web browser, MS Word in the background...), you'll get the impression that all 3 applications are running at the same time, and you'll probably feel delays when switching from one app to another. Disk I/O speed is crucial here, so the swap file is better stored on an SSD than on an HDD. When minidjvu-mod says it can't allocate enough RAM, that means the system can't provide that amount even if it dumps the processes it considers not important enough to the swap file.
The term "cache" is probably misleading in minidjvu-mod. If I recall correctly, it's a buffer in memory that remembers the results of character comparisons, because a comparison takes significant CPU time, and the -C 2 / -C 3 classifiers are so meticulous that they may compare characters that were already compared, because there are new details and it's a fuzzy comparison. It may answer: "yes", "no", or "I'm not sure yet". Without reusing these results the process would take hours. That's probably not the best approach or the best classifiers. I hope someone someday improves these results without decreasing classifier quality.
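The idea of remembering comparison results can be illustrated with a small memoization sketch. This is not minidjvu-mod's code: the Match enum, the ComparisonCache class, and the toy comparison function are all invented for illustration, and re-evaluating only the "not sure yet" answers is an assumption about how such a cache might behave.

```python
from enum import Enum


class Match(Enum):
    YES = 1
    NO = 0
    UNSURE = -1


class ComparisonCache:
    """Remember definite answers of an expensive, symmetric comparison."""

    def __init__(self, compare):
        self._compare = compare
        self._results = {}
        self.calls = 0  # how many times the expensive comparison really ran

    def get(self, a, b):
        # The comparison is symmetric, so store each pair under one key.
        key = (a, b) if a <= b else (b, a)
        cached = self._results.get(key)
        if cached is not None and cached is not Match.UNSURE:
            return cached  # definite answer: reuse it
        self.calls += 1
        result = self._compare(*key)
        self._results[key] = result
        return result


def toy_compare(a, b):
    """Stand-in for a costly glyph comparison; glyphs are just integers here."""
    if a == b:
        return Match.YES
    return Match.UNSURE if abs(a - b) == 1 else Match.NO


cache = ComparisonCache(toy_compare)
cache.get(1, 5)
cache.get(5, 1)     # same pair in the other order: served from the cache
print(cache.calls)  # 1
```

The memory cost of such a table grows with the number of glyph pairs in a dictionary, which is consistent with bigger dictionaries needing much more "cache".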

maple7-7-7 commented 3 years ago

Thanks for the meticulous explanations of everything!

Interesting stuff, and all I might need are two degrees in computer science to safely tinker with any of the processes. lol

Thanks again, Stephen



maple7-7-7 commented 3 years ago

Hi Alex,

I was wondering in your view how the shape-matching function might be simplified if one is dealing only with digitally produced documents with set typefaces from word processors, like a basic PDF would have, vs a scanned document with its often fuzzy comparisons.

A few years ago, I had this idea that for simple typefaces, instead of 1 character representing 1 shape, what would happen if we let 2 adjacent characters represent 1 shape? The letter W, for instance, already closely represents two "very adjacent" letters, VV.

Maybe have the more common pairings such as "ea" or "sh" or "ed", or "in" or "is" or "it."

Just set an arbitrary limit for each set of character combinations and experiment.

Maybe even go with 3 characters like "the" and "are" and "was" and "ing."

For 4 characters, how about "this" or "that" or "with" or "ould."

The selection hierarchy would run from more characters to fewer characters.

Another approach, have one symbol for each preposition.

Obviously encoding would take longer, and with several passes, but the new DjVu file size?

The fact that one could "do this forever" does not mean some limited sets could not be tried.

Thoughts?

Stephen



trufanov-nok commented 3 years ago

Hi, it won't be worth it. Let's say any DjVu document contains an embedded "font" (which the DjVu encoder constructs on the fly during encoding), illustrations (compressed with image algorithms), and "text" (a sequence of indexes of font glyphs). What you are suggesting is, let's say, a zip compression of the "text" part. But:

  1. DjVu already stores it compressed. Maybe not as effectively as zip, but the "text" part also contains the coordinates of each character.
  2. The size of the "text" part in the file is much smaller than the size of the "font".

Yes, we may say that a PDF with typefaces from word processors can be deconstructed with a small "font". But we can't forget about letter coordinates. If we do, that's not DjVu anymore. It's not designed for that. It's a kind of PDF to ".ttf+.doc+.jpeg" deconstructor. And if we don't, the benefit of replacing 't' and 'h' with 'th' is ~12 bytes (not counting the original "text" compression in DjVu). How many times should we replace it to compensate for the addition of a new glyph "th" to our "font"? If we don't, that's already not the DjVu standard, or at least not its current revision. And we would have to replace it often enough within the scope of one djbz... So I would say that for such a specific case as "typefaces from word processors", with a lot of effort, risky assumptions, and probably changes to the specification, you'll get a file size decrease of a few kilobytes, which isn't worth it.
The idea of representing W as VV is more "DjVu-like", but DjVu has something like that under the hood. It can encode one letter glyph using another glyph as a "prototype". I doubt that current encoders can make W from V, but E from F or G from C - maybe.
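The break-even question can be put into numbers. The sketch below takes the ~12 bytes saved per replacement from the estimate above; the glyph sizes are hypothetical placeholders, since the cost of storing a new "th" glyph in the dictionary is not given in this thread.

```python
import math


def break_even_replacements(glyph_cost_bytes, saving_per_use_bytes=12):
    """Replacements needed before a new combined glyph pays for itself."""
    return math.ceil(glyph_cost_bytes / saving_per_use_bytes)


def net_saving_bytes(uses, glyph_cost_bytes, saving_per_use_bytes=12):
    """Bytes saved (negative means lost) after `uses` replacements."""
    return uses * saving_per_use_bytes - glyph_cost_bytes


# Hypothetical sizes of the extra "th" glyph in the shared dictionary.
for cost in (60, 120, 300):
    print(cost, "bytes ->", break_even_replacements(cost), "replacements")
```

Every pairing added this way has to clear its own break-even count inside each djbz, so the total gain stays small even when the pairing is frequent.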

I don't know the details of the PDF format, but it seems to be a mix of DOC + DjVu in a nutshell (in fact, it resembles DjVu a lot when encoding scans). I think that if you make it from Word, it won't contain "font" glyphs but a font name and text, as DOC does. If you print it to JPEGs and try to re-encode from the images with a regular PDF encoder, I'm sure you won't get a PDF as small as it was, as the encoder will expect a scan and will be forced to construct and keep a "font" inside, like DjVu.
That means that if you have a bunch of PDFs made with Word typefaces, it would be feasible to create a word2djvu encoder that knows about Word typefaces and can rely on this during encoding, e.g. by using a single shared "font" dictionary across all document pages, along with other tricks that may fit into the DjVu standard. DjVu documents encoded by such an encoder would be unbeatable by other DjVu encoders that accept scanned images as input.
I would say that you don't even need to convert PDF to DOC for such a hypothetical word2djvu. You need to tweak some open-source PDF viewer that knows the PDF format to export its content to DjVu, applying the typeface knowledge. On second thought, there is the open-source pdf2djvu tool, and I would expect it to make super-compressed DjVus from basic PDFs (I didn't look into its code). If it doesn't, it's worth adding such a feature to it. In other words, the best way to encode simple PDFs to DjVu should be pdf2djvu, not a regular images-to-DjVu encoder. Likewise, for any other format X, a specialized x2djvu tool will be the best option.

maple7-7-7 commented 3 years ago

Thanks. Another awesome set of explanations.

Take care, Stephen



maple7-7-7 commented 3 years ago

Hi Alex,

Merry Almost Christmas!

I have been experimenting a little more with your Win binaries.

Your Win64 GUI for minidjvu-mod seems to work pretty well, although during conversions, I don't see the progress bar moving in the progress-bar space.

Using the Win10 command prompt, I have been using your multi-threading Win64 binary.

I have been trying various lossy options for the 500 pages of PBMs.

Erosion of course is too strong, and Smooth slightly shifts the dark right margin to the left.

The options most useful to me seem to be the Clean and the Averaged Prototypes ones.

Could you please explain to me a bit more about the Averaged Prototypes?

It is a very useful function to me.

Using this function adds a bit to file size (1443 KB vs 1423 KB), but it is the only function that allows the dots above the letter i to be perfect octagons. All other approaches give dots that are not quite symmetrical. Even with any2djvu I have this problem, for the same input PBMs 4250x5500.

Thanks again, Stephen



trufanov-nok commented 3 years ago

Hi!

I don't see the progress bar moving in the progress-bar space

Yeah, that's a known issue. It's fixed, but the fix requires a minidjvu-mod update, and I can't find the time to build an updated version for Windows. Sooner or later I'll update the binary and it should be fine.

Could you please explain to me a bit more about the Averaged Prototypes?

DjVu compression is achieved by making a dictionary of the glyphs found in the source image and replacing the glyphs in the image with their indexes in the dictionary. If you find the letter A 10 times in the image, you can put a single A image into the dictionary and replace all its occurrences in the image with the same index. The question is: which A image should be put into the dictionary? As the encoder uses fuzzy matching methods, the occurrences might be slightly different. By default, the first A occurrence is put into the dictionary. But with averaging enabled, the encoder puts into the dictionary a new A image generated from these 10 occurrences. This avoids cases where a damaged or deformed glyph is taken into the dictionary while there are a lot of better glyphs of the same letter in the image. Smooth just smooths the image before encoding; Erosion erodes the glyphs after encoding. And Clean removes from the dictionaries the glyphs that are too small to be meaningful (area less than dpi*dpi/20000).
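To make the averaging idea concrete, here is a minimal sketch that builds a prototype by per-pixel majority vote over several occurrences of the same glyph. It is only an assumption about how "a new image generated from the occurrences" could be computed, not minidjvu-mod's actual algorithm; it also assumes all bitmaps are already aligned and have the same size, which a real encoder cannot take for granted. The helper min_meaningful_area just evaluates the dpi*dpi/20000 threshold mentioned above.

```python
def average_prototype(occurrences):
    """Per-pixel majority vote over same-sized binary bitmaps (rows of 0/1)."""
    n = len(occurrences)
    height = len(occurrences[0])
    width = len(occurrences[0][0])
    return [
        [1 if 2 * sum(bmp[y][x] for bmp in occurrences) > n else 0
         for x in range(width)]
        for y in range(height)
    ]


def min_meaningful_area(dpi):
    """Area threshold (in pixels) below which Clean drops a glyph."""
    return dpi * dpi / 20000


clean = [[0, 1, 0],
         [1, 1, 1],
         [0, 1, 0]]
damaged = [[1, 1, 0],   # one stray pixel in the corner
           [1, 1, 1],
           [0, 1, 0]]

# Taking the first occurrence would keep the stray pixel; the vote removes it.
prototype = average_prototype([damaged, clean, clean])
print(prototype == clean)        # True
print(min_meaningful_area(600))  # 18.0
```

With the default behaviour, the damaged bitmap would become the dictionary entry simply because it came first; the vote lets the better occurrences win.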

maple7-7-7 commented 3 years ago

Thanks for the detailed explanation once again!

Progress bar is not really a problem, as the result is what counts.

Just wasn't sure if you knew about it.

Thanks again, Stephen



trufanov-nok commented 3 years ago

OK, let's close this issue. I've opened a discussion section for the minidjvu-mod project: https://github.com/trufanov-nok/minidjvu_mod/discussions - let's discuss non-bug-related topics there.