Fossy-Cats / Git-Buch_EN

English translation of "Das Git-Buch" (The Git Book)
https://git.io/gitbook

Rake Tasks for Dia to SVG? #26

Open tajmone opened 3 years ago

tajmone commented 3 years ago

I'm still undecided whether we should handle conversion of the Dia source projects to SVG via Rake or not.

Ideally, Rake could handle:

  1. A dia task to build all the SVG images (optimized via SVGO), thus replacing the current conversion script.
  2. Thanks to autogenerated file tasks, it would always rebuild SVG images when their source Dia project has changed.
  3. Downloading the correct Dia version and unpacking it somewhere, so it can be used for the task (without interfering with another Dia version present on the end user machine). I.e. basically replacing the current installer script (which is Win only).

The problem here is the strict requirement for a specific Dia version (0.97) which apparently is only available for Windows (see alan-if/alan-docs/wiki/Dia-Diagrams and #12), which would then make the Rakefile unusable on other OSs (including Travis CI), especially if we enforce (2), which would automatically try to rebuild any SVG when its Dia source has changed, or if Rake is invoked via -B, or after clobbering.

Definitely, the Rake tasks will have to check the Dia version before attempting conversion, and skip the operation if the version is not the exact one required.
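A minimal sketch of such a version check, assuming the version string can be obtained from something like `dia --version` and that it contains a dotted number (the exact output format is an assumption, and the helper names are hypothetical):

```ruby
# Hypothetical helpers; the wording of Dia's version output is assumed.
REQUIRED_DIA_VERSION = '0.97'

# Extract the first dotted version number (e.g. "0.97" or "0.97.2")
# from the text printed by Dia, or return nil if none is found.
def dia_version_from(output)
  match = output.to_s.match(/(\d+\.\d+(?:\.\d+)*)/)
  match && match[1]
end

# True only when the reported version is exactly the required one.
def required_dia?(output, required = REQUIRED_DIA_VERSION)
  dia_version_from(output) == required
end
```

A Rake task could call `required_dia?` on the captured output and skip the conversion, with a warning, when it returns false.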

Before deciding on this, we need to find out if there's a Dia version for Linux and macOS that produces identical SVG files from the same source projects, i.e. so that no changes are detected by Git after rebuilding an SVG from an unchanged project. If that were the case, we could then perform different version checks based on the OS (we could settle for covering Linux only right now, until we find a macOS user who can carry out the tests for us).

Alternatively, we should find a way to disable the Dia related tasks for those OSs for which we don't have a matching Dia version — this can be done, programmatically, via Ruby code, but detecting the host OS in Ruby requires some hacks that are not bullet-proof (we need to ensure that any Windows, Linux or Mac operating system is detected correctly).
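One common way to detect the host OS in Ruby is to pattern-match on `RbConfig::CONFIG['host_os']`. The sketch below is not bullet-proof (the patterns are a best guess and the function name is hypothetical), but it shows the classic pitfall: "darwin" contains "win", so macOS has to be tested before Windows.

```ruby
require 'rbconfig'

# Map Ruby's host_os string to a coarse OS family.
# "darwin" is checked first because the substring "win" also occurs in it.
def os_family(host_os = RbConfig::CONFIG['host_os'])
  case host_os
  when /darwin|mac os/i           then :macos
  when /mswin|msys|mingw|cygwin/i then :windows
  when /linux/i                   then :linux
  else :unknown
  end
end
```

Anything that falls through to `:unknown` could simply have the Dia tasks disabled.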

Either way, we ultimately need to decide how to handle this.

tajmone commented 3 years ago

Linux Testing

@SicroAtGit, I would need your help regarding the Linux tests to find a Dia version (and its link) that produces identical SVGs to those created by Dia v0.97 for Windows.

The problem is that testing will have to be done locally, since the build script in the repository also optimizes the SVG images via SVGO, which might introduce differences that compromise the tests (another issue altogether, which needs to be considered separately, after this is solved).

Another problem might be the actual fonts used, which might not be identical on all OSs, or might produce slightly different results because of differences in Dia's internal libraries, etc. (not sure, but worth considering).

Overall, I think that managing to cover this task for all OSs and producing identical optimized SVGs is going to be tricky, considering all of the above. The general problem remains, though: if different OSs produce differing SVGs, updating the diagrams is going to be problematic (unless either only one maintainer is in charge of this task, or all SVGs are updated at once whenever there's a change).

The idea of having Rake detect changes to the Dia source and rebuilding the SVG images that require updating is quite important, but if this results in those images being overwritten endlessly every time a contributor force rebuilds all Rake tasks then we'd end up with spurious commits.

What's your view on this? Should we just leave this out of Rake, and use the script whenever there's a need to?

Alternatively, we could:

Combining the two solutions above would already improve the current state of affairs, at least until we can find a perfect solution (e.g. when finally a new Dia release is out for all OSs which handles CairoSVG correctly).

SicroAtGit commented 3 years ago

Can you please provide me a branch where the SVG files were created using only Dia (without SVGO)?

tajmone commented 3 years ago

Can you please provide me a branch where the SVG files were created using only Dia (without SVGO)?

There isn't one, I've added SVGO almost immediately to the script because the optimizations are considerable.

We'll probably have to either open a test repository for this or see if we can manage it via Gists.

I do have a test repository which we can use freely:

https://github.com/tajmone/testing

I'll start by inviting you as a collaborator. If you feel like copying the script over there (without the SVGO optimization part) and a sample Dia project, we could then check if I get identical results. Else, I can set it up later, when I find some time.

SicroAtGit commented 3 years ago

There isn't one, I've added SVGO almost immediately to the script because the optimizations are considerable.

Yes, that's why the question.

For you, it is quickly done to check out from the master branch to a new only-dia branch, comment out SVGO in build.sh, run the build script, commit the changes, and push the only-dia branch to this remote repository.

I can then check out the only-dia branch locally and run my Dia tests. Once Git reports no change in the working directory, I have found a solution that creates a matching result.

If you feel like copying the script over there (without the SVGO optimization part) and a sample Dia project, we could then check if I get identical results.

If you don't like my approach above, feel free to upload the SVG files to your test repository, provide them in a gist, or email them to me instead.

It would be better if I could check with my Linux locally and alone if my Dia tests are successful or not.

Linux (MacOS) support

As I wrote in the other Issues, older program versions are uncommon on Linux. In the package repository there is only one program version, except for a few exceptions, like python:

With the Dia version and the SVGO version from the package repository, I don't get identical SVG files that match yours. I have installed the fonts we specified, of course.

I installed the specified Windows version of Dia using WINE, adjusted the build.sh to use the WINE Dia version, and added the specified fonts to the WINE environment, but Git continues to report differences in the working directory when I run the build script. The build script continued to run the Linux version from the SVGO tool. So I think maybe the Linux version of the SVGO tool or the version of the SVGO tool is causing the difference.

To make it easier to find the culprit, I would like to focus on the Dia tool first (without SVGO).

If we find a WINE solution, maybe it will then also work on macOS with WINE.

tajmone commented 3 years ago

If you don't like my approach above, feel free to upload the SVG files to your test repository,

I've opted to create dedicated tests on that repository:

https://github.com/tajmone/testing/tree/master/dia

The build script will include in the generated SVGs info about the OS, Dia and SVGO versions used at each run, so we can store all versions even if SVGO is updated. Also, for each execution it will generate two separate SVGs: the plain SVG generated via Dia and its version optimized via SVGO, so we can better track whether any differences are introduced by SVGO updates.

With the Dia version and the SVGO version from the package repository, I don't get identical SVG files that match yours. I have installed the fonts we specified, of course.

SVGO was updated since I last used it in the repository, and in the past I noticed that an updated version will often produce different SVGs. So it looks like this might introduce a further complication, since end users might not have updated SVGO to the latest version.

A possible solution for Rake would be to split the conversion in two steps: first generate the plain SVG files in the Dia folder, then optimize them into the book images folder. The former could be ignored by Git, to avoid cluttering the repository with duplicate SVGs, but this would mean that Rake would always try to convert Dia source to SVGs at first run (since the files would be missing).

On the other hand, if we track the plain SVGs in the Dia folder, then Rake would only run Dia when a .dia project has changed, and SVGO only when their plain .svg has changed. The only real benefit would be the ability to force re-optimizing the SVGs without having to convert Dia sources (e.g. via rake -B svgo).

It's quite an intricate problem, due to all the variables at play here (fonts must be identical, not all Dia versions being available to all OSs, and SVGO being a tool that gets frequently updated).

I'm not really bothered about SVGO: if an image is optimized using a different SVGO version, it's not really a big deal. It's still optimized, and the worst case scenario is that every time someone runs an optimization we end up having to commit all SVG diagrams again. The bad thing would be if two contributors keep using different SVGO versions, all the time, resulting in a bouncing effect of re-writing the same SVGs over and over again, but this should be mitigated by Rake, since SVGO should only be invoked when a Dia source has changed, and only for that specific diagram! (Again, another huge advantage of Rake over Bash scripts.)

SicroAtGit commented 3 years ago

I am currently wondering if the SVG files are really required in the GitHub repository. Anyone can have them generated locally from the DIA files using the build script.

Regarding Linux support

The results of the Windows version of Dia running with Wine look good:

Here you can see the modified build script that runs Dia with Wine: build_wine.sh

tajmone commented 3 years ago

I am currently wondering if the SVG files are really required in the GitHub repository. Anyone can have them generated locally from the DIA files using the build script.

Setting up Dia is quite a pain, especially on non-Windows OSs, so I think it's worth tracking the SVGs to make it easier for end users to simply build the docs using Asciidoctor. If we make it too hard, it'll be a deterrent to contributions like typo fixes, etc., where the contributor is likely to have Asciidoctor installed but not Dia.

Also, disk space is cheap nowadays, so I don't think that the SVG files are going to make a huge difference, especially since they won't be updated that often once we reach the first stable edition.

Here you can see the modified build script that runs Dia with Wine: build_wine.sh

Thanks, that was really needed! I hope the problem of Dia not supporting the zero-width border on Cairo SVG in its latest versions gets fixed soon, so that we could expect the latest Dia version to produce the same results on all OSs. Right now, with the two different websites and the releases for different OSs not always covering all versions, it's a bit messy (just having to read the long explanations on which version of Dia is required is not really encouraging).

Still, I think Dia is a superb tool like no other tool of its kind. So it's worth the pain.

SicroAtGit commented 3 years ago

Setting up Dia is quite a pain, especially on non-Windows OSs

Now if we use the Windows version of Dia also on Linux via Wine, the setup is also easy. I will write an installation guide or an installation script. The only pain for some might be the installation of Wine, because some don't want that mainly for security reasons.


If we make it too hard, it'll be a deterrent to contributions like typo fixes, etc., where the contributor is likely to have Asciidoctor installed but not Dia.

It would have prevented the following, but you're right:

The bad thing would be if two contributors keep using different SVGO versions, all the time, resulting in a bouncing effect of re-writing the same SVGs over and over again

I would additionally like to point out that the diagrams of Dia (Windows) and Dia (Wine) look largely identical, but still not completely identical (the text height in the boxes is slightly different), which, in addition to the different SVGO versions, also leads to different SVG files. The reason is not that the fonts in the diagrams are not fixed, because in the testing repository this is fixed in the sample diagram: https://github.com/tajmone/testing/commit/c18006578683c50960f81b66bc30c6591088d912


Still, I think Dia is a superb tool like no other tool of its kind. So it's worth the pain.

I see it the same way.

A possible solution for Rake would be to split the conversion in two steps: first generate the plain SVG files in the Dia folder, then optimize them into the book images folder. The former could be ignored by Git, to avoid cluttering the repository with duplicate SVGs, but this would mean that Rake would always try to convert Dia source to SVGs at first run (since the files would be missing).

When Rake creates the plain SVG files from the Dia files on the first run, the file dates are newer than the optimized SVG files which are already in the working directory from the beginning, which triggers Rake to recreate the optimized SVG files as well.

As far as I know, Rake compares the modification dates of the files to detect and decide when it is time to recreate the target files. Git does not commit file dates, so checked out files always have the current date.
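The timestamp rule described above can be modelled in a few lines of plain Ruby. This is a simplified sketch of the idea, not Rake's actual implementation, and the function name is hypothetical:

```ruby
# A target is considered stale when it is missing, or when at least one
# of its sources has a more recent modification time than the target.
def needs_rebuild?(target, sources)
  return true unless File.exist?(target)

  target_time = File.mtime(target)
  sources.any? { |src| File.mtime(src) > target_time }
end
```

Because only modification times are compared, re-saving an unchanged source is enough to make its target look stale.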

What do you think about Rake monitoring the Dia files for changes and executing Dia and SVGO when the Dia files are changed? So, just a one-task conversion. The assets_src/img/dia/*.svg files should be ignored by Git.

Rake pseudocode:

file 'docs_src/images/*.svg' => 'assets_src/img/dia/*.dia'
on each changed Dia file {
  dia: 'assets_src/img/dia/file.dia' > 'assets_src/img/dia/file.svg'
  svgo: 'assets_src/img/dia/file.svg' > 'docs_src/images/file.svg'
}

After a Git checkout, the Dia files and the optimized SVG files have the current date. When the Dia files are modified, they have a more recent date than the optimized SVG files, which causes Rake to recreate the optimized SVG files.
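The pseudocode above could be turned into real Ruby roughly as follows. This is only a sketch: the helper name is hypothetical, and the command-line options for Dia (`--export`, `--filter=cairo-svg`) and SVGO (`-i`, `-o`) are assumptions that should be checked against the installed versions.

```ruby
DIA_DIR    = 'assets_src/img/dia'
IMAGES_DIR = 'docs_src/images'

# For one Dia source, return the plain SVG path, the optimized SVG path
# and the two commands (as argument arrays) that would produce them.
# Nothing is executed here; the tool options are assumed, not verified.
def conversion_plan(dia_file)
  name      = File.basename(dia_file, '.dia')
  plain     = File.join(DIA_DIR, "#{name}.svg")
  optimized = File.join(IMAGES_DIR, "#{name}.svg")
  {
    plain: plain,
    optimized: optimized,
    commands: [
      ['dia', "--export=#{plain}", '--filter=cairo-svg', dia_file],
      ['svgo', '-i', plain, '-o', optimized]
    ]
  }
end

# In a Rakefile, each plan could become a single file task (sketch):
#
#   Dir.glob("#{DIA_DIR}/*.dia").each do |dia_file|
#     plan = conversion_plan(dia_file)
#     file plan[:optimized] => dia_file do
#       plan[:commands].each { |cmd| sh(*cmd) }
#     end
#   end
```

Keeping the commands as data makes it easy to print them for a dry run before letting Rake execute anything.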

tajmone commented 3 years ago

@SicroAtGit, sorry for the delay in replying.

I will write an installation guide or an installation script.

Thanks, that would be really useful; and I'll also reuse that text in the ALAN Docs Wiki, since we're using Dia for all our documentation.

The only pain for some might be the installation of Wine, because some don't want that mainly for security reasons.

Another problem might be 32-bit support, which I believe many Linux distros have been planning to drop for quite a while (IIRC, Ubuntu had already announced it, but postponed it due to high user demand for a grace period).


I would additionally like to point out that the diagrams of Dia (Windows) and Dia (Wine) look largely identical, but still not completely identical (the text height in the boxes is slightly different), which, in addition to the different SVGO versions, also leads to different SVG files.

I was wondering whether I should tweak the build script so that it saves the unoptimized SVGs in the Dia folder (ignored by Git) and then optimizes them via SVGO only in the images/ folder. This would allow developers to compare the optimized and unoptimized SVGs (at least locally), to verify whether some of the problems might be due to SVGO, instead of Dia. It's possible, in fact I noticed that if I increase the optimization level above the default I get corrupted SVGs, so it could be a case of some SVG info being lost in the optimization process.

The reason is not that the fonts in the diagrams are not fixed, because in the testing repository this is fixed in the sample diagram: tajmone/testing@c180065

If it's only a matter of small differences in the resulting SVG source, but without affecting the final image, it should be OK (it also happens when using different versions of SVGO), but if it's a matter of lost SVG information, then we'll have to consider either retouching the optimization settings or dropping optimization altogether.


As far as I know, Rake compares the modification dates of the files to detect and decide when it is time to recreate the target files. Git does not commit file dates, so checked out files always have the current date.

Yes, that's the case. Even saving an unchanged file in the editor will cause Rake to see its "last modified" date as changed.

The reason I mentioned this was mainly because not tracking the SVGs at all would mean that the Rake build would end up depending on having Dia installed on the system to work, which would end up being a deterrent for the occasional contributor.

If I add a Dia task to the repository, it would have to check for the correct version of Dia being present on the system.

But then, Rake is not a dumb tool like Make, so I could just add to the Rakefile a few lines of code to check whether the required Dia version is present, and if not just programmatically drop the Dia task, so users who don't have Dia would still be able to build the rest of the repository, and see a warning note about Dia not being found on their system, and links to the instructions.
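A rough sketch of how the Rakefile could decide whether to define the Dia task at all, assuming the executable is simply called `dia` and is looked up on the PATH (on Windows the name and location would differ; both helper names are hypothetical):

```ruby
# Look for an executable called `name` in the directories of a PATH-style
# string; return its full path, or nil when it cannot be found.
def which(name, path_env = ENV.fetch('PATH', ''))
  path_env.split(File::PATH_SEPARATOR).each do |dir|
    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# Decide what the Rakefile should do about the Dia task.
def dia_task_status(path_env = ENV.fetch('PATH', ''))
  which('dia', path_env) ? :define_task : :skip_with_warning
end
```

When the result is `:skip_with_warning`, the Rakefile would leave the Dia task undefined and print a note with links to the setup instructions, so the rest of the build still works.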

What do you think about Rake monitoring the Dia files for changes and executing Dia and SVGO when the Dia files are changed? So, just a one-task conversion.

That would be the correct approach, but it depends on whether we wish to make Dia a strict requirement for the build process or not. Once we agree on that point, we can decide whether to keep unoptimized copies of the SVG files in the Dia folder along with their SVGO optimized version in images/ (for manual comparison, since optimization could always affect images quality in the future).

Rake pseudocode:

If we decide to add a Dia task, we'll just handle it procedurally by using file patterns and a dedicated rule, so if a new Dia source is added to the dia/ folder it will automatically create a new SVG target for it — i.e. we'll create the list of the target SVG files by parsing all the .dia source files and manipulate their path and extension to generate a list of targets. This is a practical example of why Rake is a million times better than GNU Make.
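As an illustration of deriving the target list from the sources, Rake's `pathmap` string extension can rewrite the directory and extension in one step. The directory names below are taken from the pseudocode quoted earlier in the thread; the function name is hypothetical.

```ruby
require 'rake'

# Derive the optimized SVG target for each Dia source by swapping its
# directory and extension (%n is the file name without its extension).
def svg_targets(dia_sources, images_dir = 'docs_src/images')
  dia_sources.map { |src| src.pathmap("#{images_dir}/%n.svg") }
end

# In a Rakefile the sources would typically come from a file pattern:
#
#   DIA_SOURCES = Rake::FileList['assets_src/img/dia/*.dia']
#   SVG_TARGETS = svg_targets(DIA_SOURCES)
```

With this approach, dropping a new `.dia` file into the folder is enough for a matching SVG target to appear on the next run.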

After a Git checkout, the Dia files and the optimized SVG files have the current date. When the Dia files are modified, they have a more recent date than the optimized SVG files, which causes Rake to recreate the optimized SVG files.

Yes, that's the default behaviour (if the source has been modified more recently than the target then it triggers its build), which should be fine in most cases, even with version control. But changing the trigger rule in Rake is possible, since it's just a Ruby DSL. In one project, for example, I had to intervene in this sense to handle Asciidoctor warnings: if you set Asciidoctor failure-level to WARN the HTML document will still be produced, but it might not be as expected (e.g. missing images); in that case I've added a few lines to change the creation and modified date of the HTML file to Epoch time 00:00:00, so that Rake will always try to rebuild the doc again.
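The "Epoch zero" trick can be sketched with `File.utime`. This is not the code from the project mentioned above, just an illustration with a hypothetical function name; note that `File.utime` only sets access and modification times, which is what timestamp-based rebuild checks look at.

```ruby
# Reset a file's access and modification times to the Unix epoch, so that
# timestamp-based build tools treat it as older than any real source.
def mark_stale(path)
  epoch = Time.at(0)
  File.utime(epoch, epoch, path)
end
```

Calling `mark_stale` on an HTML file produced with warnings would make the next build attempt it again.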

I've already fixed this in alan-if/alan-i18n/_assets/rake/asciidoc.rb, but didn't find the time to implement it here yet.

So, whatever the need might be, there's always a solution in Rake, since it's just Ruby code (both Rake and the Rakefile are Ruby, so they can be both manipulated at execution time).

The problem with our Dia build script is that it's so Windows specific that I simply think it's not worth adding it to the Rakefile. We don't need to update the SVG images often either (in theory, once we've finished tweaking them, we shouldn't really need to edit them at all), so I still think it makes more sense to handle the Dia task via a separate script. We could drop the Bash script and use a Ruby script, but I think it should be independent of Rake.

tajmone commented 3 years ago

I've now updated the Rakefile to invoke Asciidoctor with failure-level WARN (had to use the trick of changing the target file to Epoch "zero day" to handle that).

SicroAtGit commented 3 years ago

Another problem might be 32-bit support, which I believe many Linux distros have been planning to drop for quite a while (IIRC, Ubuntu had already announced it, but postponed it due to high user demand for a grace period).

Yes, let's hope there will be a new Dia release for all OS in the near future.

It's even worse with macOS, because its newer versions no longer support 32-bit at all. So Dia via Wine looks bad there too.

But some Linux distributions still have 32-bit support, so I think Dia via Wine is currently a good solution.

I was wondering whether I should tweak the build script so that it saves the unoptimized SVGs in the Dia folder (ignored by Git) and then optimizes them via SVGO only in the images/ folder.

I think it's a good idea.

If it's only a matter of small differences in the resulting SVG source, but without affecting the final image, it should be OK

I compared them side by side with assets_src/img/dia/preview.html in Firefox - no content changes, only the font height is slightly different in the boxes and extremely few differences in the font rendering, see animated GIF: Difference_Win_Wine

The problem with our Dia build script is that it's so Windows specific that I simply think it's not worth adding it to the Rakefile. We don't need to update the SVG images often either [...], so I still think it makes more sense to handle the Dia task via a separate script

Yes, let's leave it as it is.

we could drop the Bash script and use a Ruby script, but I think it should be independent of Rake.

No problem with switching from bash script to ruby script.

Would be good if it was then also Dia via Wine compatible. If a Linux environment was detected, just execute wine Dia/bin/dia.exe instead of Dia/bin/dia.exe.
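If the script were ever ported to Ruby, the Wine switch could look something like the sketch below. The function name and the OS symbols are hypothetical; the executable path is the one mentioned above.

```ruby
# Build the Dia command line: on Linux the Windows executable is launched
# through Wine, elsewhere it is invoked directly.
def dia_command(os_family, dia_exe = 'Dia/bin/dia.exe')
  os_family == :linux ? ['wine', dia_exe] : [dia_exe]
end
```

The returned array can be extended with the export options and passed to `system(*command)`.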

tajmone commented 3 years ago

no content changes, only the font height is slightly different in the boxes and extremely few differences in the font rendering, see animated GIF:

Curious. I wonder if it's due to OS specific fonts handling libraries, differences in decimal numbers rounding, or whatever else. It's nothing big, but enough to trigger Git seeing those files as changed (I'm assuming there are differences in the coordinates values in the final SVG's XML). These small aspects are unlikely to ever be polished out if Dia uses separate repositories for Windows and Linux, since there can't be a unified test suite.

Dia is a project that really deserves sponsorship. If there were a company that could financially back its development, we might see more active development (as in full time). Today the trend is to use various ASCII to SVG diagram generators (ditaa, etc.), but the strength of Dia is that you can add custom element sets to the library, by creating your own SVGs to support symbols of a new DSL, or whatever images one might need (e.g. a chessboard with all the chess pieces, to represent chess problems, etc.), which I've never seen in any other diagram application.

Would be good if it was then also Dia via Wine compatible. If a Linux environment was detected, just execute wine Dia/bin/dia.exe instead of Dia/bin/dia.exe.

That's a good idea! Feel free to amend the Bash script accordingly. Right now, it might not be worth switching to Ruby, since it's a fairly simple script.

SicroAtGit commented 3 years ago

Looks like the fonts we have specified here are not installed properly on your Windows 10: assets_src/img/dia/README.md#diagrams-fonts

I just installed the fonts on my Windows 7 which I have in VirtualBox. Dia generates here a non-identical sample__Win7__Dia-0.97.svg to your sample__Win__Dia-0.97.svg, although they should be identical if your fonts are installed correctly.

However, these are identical:

tajmone commented 3 years ago

Looks like the fonts we have specified here are not installed properly on your Windows 10:

I've installed them in my user profile, to avoid overwriting the default system fonts, so they should be seen correctly by the OS, unless there's a problem with how Dia queries for the fonts. I'm not sure how this works in detail, but user fonts should have higher precedence over system fonts, so usually applications don't need to worry about this since it's handled transparently by the OS. But there's always the chance that Dia might be creating its own list of fonts by parsing the actual system folder, in which case when I run Dia it uses the original MS fonts instead.

There isn't much I can do in this respect (don't really want to override the default MS fonts, most applications are using correctly the fonts from my User profile).

although they should be identical if your fonts are installed correctly.

In theory, but it might also not be related to the fonts but to the OS layer that handles fonts. Win 7 and Win 10 are not the same OS, so they might be using different libraries to handle fonts transformations (actually, Win 10 definitely added some new font control features which you can enable and set from the Control Panel, to improve readability).

Furthermore, we don't know the details of the virtualization engines (VMWare, Wine, etc.); since they are emulating the OS, they might just as well be delegating to Linux some common functions like font handling, etc.

From what I saw, the differences so far have been in alignment, or font baseline, all of which could be determined by a number of factors (different measuring units, rounding algorithms, etc.). It could also be due to the Cairo library, hard to tell.

The only sure test here would be to replace my fonts in the User folder with same-named font files with different glyphs (e.g. skulls instead of letters), so if I'm seeing letters it means it's using the system fonts and not the user fonts. The problem is that in order to do that I'd need to hack a dingbats font and change its metadata to mimic the target font (Inconsolata, etc.), which requires a lot of work and the right tools.

SVGs are vector images, so different viewers will never produce an identical raster rendition, unless they are using the same rendering library. So I wonder if it's even possible to ensure identical SVG results using Dia on different OSs, considering that Dia for Windows and Linux have different codebases, and the Cairo library involved might actually be a different version (and under Windows it might also be pre-compiled elsewhere). Fonts are vector data, which is then translated from font glyphs to SVG paths, so chances are that somewhere along the line a system library ends up producing different coordinates due to differences in the way rounding is handled. We're speaking of rounding floating point values, so even if there's a standard, results might not always be identical.

SicroAtGit commented 3 years ago

There isn't much I can do in this respect (don't really want to override the default MS fonts, most applications are using correctly the fonts from my User profile).

You could temporarily delete the fonts in the user directory and then have Dia create the SVGs. If Git reports changes, Dia has detected your fonts in the user directory.

Do the fonts we use in the diagrams also exist in your system fonts?

Win 7 and Win 10 are not the same OS, so they might be using different libraries to handle fonts transformations

It could be, I don't rule it out. But when I saw that Dia generates the same SVG files in Windows 7 (VirtualBox) and Wine, I now started to think if maybe something is wrong with your SVG files.

(actually, Win 10 definitely added some new font control features which you can enable and set from the Control Panel, to improve readability).

These settings usually have an effect only on GUI texts, not painted texts.

Otherwise, Microsoft should issue a warning: "Finish your image painting first before changing this setting." Imagine Photoshop users who, after a Windows update, want to continue writing text in their image and suddenly notice differences.

The only sure test here would be to replace my fonts in the User folder with same-named font files with different glyphs (e.g. skulls instead of letters), so if I'm seeing letters it means it's using the system fonts and not the user fonts.

As mentioned above, temporarily deleting the fonts in the user directory would be a good quick test. Depending on this test, I can decide whether I create such a special font for you.

So I wonder if it's even possible to ensure identical SVG results using Dia on different OSs, considering that Dia for Windows and Linux have different codebases, and the Cairo library involved might actually be a different version (and under Windows it might also be pre-compiled elsewhere).

That's why I'm currently trying Windows Dia via Wine under Linux.

If at some point the Dia developers release a new version that is equally available for Windows and Linux, we can try it again with Windows Dia and Linux Dia.

Because Linux-Dia creates a black frame around each SVG and Windows-Dia does not, my only current choice is Windows-Dia via Wine.


If we have so many problems with the SVG files, Rake should definitely provide non-optimized SVG files as well (not tracked by Git).

tajmone commented 3 years ago

Do the fonts we use in the diagrams also exist in your system fonts?

Yes, they have been part of the MS fonts included in various Windows editions. The MS versions are commercial (you can buy them separately too), but they are basically the same fonts, and the names are the same too. Legal questions about fonts are a bit complicated when it comes to classical typography fonts which existed before the computer era, but basically whoever reproduces a classic font owns copyright over the digital product.

Classic fonts (Times, Garamont, etc.) were widely used in typography. To ensure that the metal specimens produced around the world were all alike, so that it would be possible to reproduce them almost identically using different press machines and spare parts from different manufacturers, the details of the glyphs were always well documented and their design made available via font books.

When you re-create a font you use special software, but the building blocks for the classic fonts are the same ones used in non-digital production, i.e. dividing the specimen's space using diagonals, circles, etc. Hence, good digital font reproductions should be almost identical among themselves, and even though Git will spot microscopic data differences in their vector paths, usually the eye doesn't. Also, good font software uses standard algorithms for smoothing out fonts, which tend to produce almost identical (often entirely identical) results, since the goal is to reproduce classic fonts using the same canons of the past.

These settings usually have an effect only on GUI texts, not painted texts.

There are various libraries to handle fonts and other vector graphics (including SVG), and the same libraries are used to paint fonts in GUI applications, OS native controls, browsers, and to produce raster images. Ultimately, you can only view raster graphics, for vector graphics are just a sort of blueprint of potential images. So, the difference can be due to the graphic library being used, or its version. I believe that Wine relies on some MS components from Win NT that were made freely available and/or open source, and that Win 7 might still be using some of them too, whereas with Win 10 many of the old NT components were replaced with either newer versions or .NET counterparts. So it's hard to tell.

Bear in mind that between Win 7 and Win 10 there's a huge gap, like between XP and Vista, for Win 10 changes a lot of things under the hood.

Imagine Photoshop users who, after a Windows update, want to continue writing text in their image and suddenly notice differences.

I've worked with Photoshop daily for around 10 years, and what you mention is nothing unusual really. Even the fonts shipped with the OS are updated quite frequently, including changes in the way ligatures are represented, and if you look at the list of fonts shipped with Windows releases you'll notice that same-named fonts found on Win 7 and Win 10 (for example) are not necessarily the same version.

Furthermore, TTF and OTF fonts are becoming an outdated standard today, being replaced by newer standards that support colours and many new features, which Dia unfortunately doesn't yet support. Some of these new font features might be rendered differently on different OSs, e.g. colours, since each OS uses its own colour profiling library. When you think about it, it's a bit like web contents and browsers: when you look closely, no web page is shown identically on two different browsers, or even on different versions of the same browser. The new standards are more like guidelines. Unfortunately, when it comes to version controlled files a generated image is either identical to its previous version or not, so guidelines are not good enough when tracking files generated with different tools.

As mentioned above, temporarily deleting the fonts in the user directory would be a good quick test. Depending on this test, I can decide whether I create such a special font for you.

Honestly, I don't want to tamper with the OS folders. In my line of work I need to keep the OS as close to the vanilla distribution as possible, to be able to test my products against a base installation. I don't think we have a real problem at hand here; what matters is that in any maintenance cycle there's one person in charge of building the SVG files. If in two years' time someone else takes on the project, he/she will simply have to rebuild all the diagrams at once, with his/her OS and tools. By that time, probably a newer version of Dia would produce different SVGs anyhow, just like updates to SVGO produce differing output. That should be fine, as long as it doesn't lead to a situation where each contributor is generating differing SVGs for no reason (which is why I believe we should keep the diagrams out of the Rakefile).

That's why I'm currently trying Windows Dia via Wine under Linux.

If you want, you can be in charge of building the SVGs from Dia sources, so we are sure we're using the correct fonts; that's fine with me. Then if I tweak the Dia project files I just test them locally without committing the generated SVGs, which you can add in a following commit.

Because Linux-Dia creates a black frame around each SVG and Windows-Dia does not, my only current choice is Windows-Dia via Wine.

The big question is Why? In theory, all Dia versions should be using the same Cairo library to create the SVG images, which is a cross platform library, so I don't see why on one OS the border is omitted and on another it's painted. Probably the Cairo lib relies on some OS native functions for some operations, which might produce different results. I really don't know.

If we have so many problems with the SVG files, Rake should definitely provide non-optimized SVG files as well (not tracked by Git).

I'm quite happy with the current solution, in the sense that (lack of transparent BGs aside) we finally have SVGs as planned, and even though it's not a perfect solution we have usable SVGs which we can tweak at any time if need be (translate text, change colours, etc.). Cross platform software that really works is a recent thing, and in the past it was more problems than satisfaction. Unfortunately, of the three major OSs (Linux, Win, macOS) with which we measure "cross-platform", two are proprietary, which doesn't make things easy.

SicroAtGit commented 3 years ago

Thanks for the interesting information.

If you want, you can be in charge of building the SVGs from Dia sources, so we are sure we're using the correct fonts; that's fine with me. Then if I tweak the Dia project files I just test them locally without committing the generated SVGs, which you can add in a following commit.

Dia recognizes your fonts, because they look pretty similar in your SVGs. In the GIF above, look at the rightmost 'M' in the box. In your SVGs there are extra pixels at the top of the letter, and in my SVGs I have them below the letter.

I noticed that with the 'M' box there is an additional text box labeled also with 'M' above it. Probably the extra pixels are caused by this. Tomorrow I will examine this more exactly.

Today I had time to install Windows 10 in VirtualBox and natively on a notebook:

$ md5sum tajmone/*.svg
f8eba77e97ce038d733db4ded88e9ca3  tajmone/sample__Win10_native__Dia-0.97.svg
$ md5sum SicroAtGit/*.svg
51a6d319dd3c2da54b90ee8c42bd5ce4  SicroAtGit/sample__Win10_native_Notebook_fonts_installed__Dia-0.97.svg
8f9959bb02123ddcbe08d1bb5d6bb4e5  SicroAtGit/sample__Win10_native_Notebook_fonts_not_installed__Dia-0.97.svg
51a6d319dd3c2da54b90ee8c42bd5ce4  SicroAtGit/sample__Win10_VirtualBox_Lnx__Dia-0.97.svg
51a6d319dd3c2da54b90ee8c42bd5ce4  SicroAtGit/sample__Win7_VirtualBox_Lnx__Dia-0.97.svg
51a6d319dd3c2da54b90ee8c42bd5ce4  SicroAtGit/sample__Wine__Dia-0.97.svg

Console output, if fonts are not installed on Windows 10: [image: sample__Win10_native_Notebook_fonts_not_installed__Dia-0.97]

As you can see, even a native installation of Windows 10 on completely different hardware (notebook) generates the same SVG files if the Dia version and fonts specified in this repository are installed.
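The same comparison can be scripted instead of reading the `md5sum` listing by eye. Below is a minimal Ruby sketch (Ruby only because the project already relies on Rake); the helper name `group_by_digest` and the glob pattern are illustrative assumptions, not something that exists in the repository. Files that land in the same group are byte-identical.

```ruby
require "digest/md5"

# Group file paths by the MD5 digest of their contents.
# Paths that end up under the same key are byte-identical files.
def group_by_digest(paths)
  paths.group_by { |path| Digest::MD5.file(path).hexdigest }
end

if __FILE__ == $PROGRAM_NAME
  # Assumed layout: the sample SVGs sit in per-user folders, as in the listing above.
  group_by_digest(Dir.glob("{tajmone,SicroAtGit}/*.svg").sort).each do |digest, files|
    puts "#{digest}  (#{files.length} file(s))"
    files.each { |file| puts "  #{file}" }
  end
end
```

With the samples above, the five matching SicroAtGit builds would be expected to share one group, while the "fonts not installed" build and the tajmone build would each stand alone.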

The big question is Why? In theory, all Dia versions should be using the same Cairo library to create the SVG images, which is a cross platform library, so I don't see why on one OS the border is omitted and on another it's painted. Probably the Cairo lib relies on some OS native functions for some operations, which might produce different results. I really don't know.

Remember that for the Linux version of Dia I use the current version 0.97.3 because I have not had success compiling the 0.97 version from source so far.

I believe we should keep the diagrams out of the Rakefile

Yes, you are right. I thought it made sense to convert only the Dia files to SVG files that were changed, but because we get different SVG files, the unchanged SVG files then look inconsistent. As it is currently, it's fine.

tajmone commented 3 years ago

I noticed that with the 'M' box there is an additional text box labeled also with 'M' above it.

That's right, it's because of the stripes. I don't recall the precise details, but there's a background text box with one of the two colours, then the stripes with the other colour on top of it, and then I think another box with just the border to cover the stripes overlapping the first border, and finally just an M character to cover the stripes. Something like that. It required this sort of hack to obtain the stripes (the original was a bit more messy IIRC). Also, the stripes are grouped, and the whole box is grouped into a single compound item.

Probably the extra pixels are caused by this. Tomorrow I will examine this more exactly.

Maybe. Check if there are two M letters (one in the text box further away and another at the very top). I probably left the former in order to properly align the latter, but it should be safe to remove the former since the object is now grouped and will always move its components all together.

Console output, if fonts are not installed on Windows 10:

Mhhh, it seems that both fonts are missing, including Inconsolata. I don't remember about Open Sans, but I'm sure Inconsolata was natively installed on my Windows OS, but it's probably part of the Office suite that came included with it and has many extra fonts. Bear in mind that I originally installed Win 7, removed Office (almost immediately) and then migrated to Win 8, and finally to Win 10.

The full list of all the fonts that shipped with every Windows edition can be found here:

but the above list does not include the fonts from Office. When you buy a PC with pre-installed Windows OEM you usually get a free Office suite and some other lighter version of other MS packages. The actual contents of the OEM edition might vary from one PC producer to another, and possibly there might be variations in the included fonts based on the country too (I'm pretty sure that the vanilla Windows editions in Asia or the Middle East contain more non-European fonts than Western editions).

I believe that once you have installed a MS font on your system, it gets updated via Windows Update, regardless of how it got there, since it's a MS product.

Remember that for the Linux version of Dia I use the current version 0.97.3 because I have not had success compiling the 0.97 version from source so far.

I use 0.97.3 for editing, but not for the CLI builds because it caused a lot of problems.

I also noticed that your script errors mention the Pango library, not Cairo. I wonder why.

ImageMagick

ImageMagick is a good cross platform library and CLI tool:

https://imagemagick.org/

It now also supports SVG (haven't tried that, since I haven't used it in ages). Although it doesn't support the Dia format, I was wondering whether we could use it to solve the cross platform Dia issues by first exporting from Dia to a vector image format that can handle transparency and borders correctly on all OSs and Dia versions, and then use ImageMagick (or some other tool) to convert from that intermediate format to SVG.

As long as the intermediate vector format supports all the required information, the conversion to SVG should be lossless.

ImageMagick supports a lot of command line options (and even project settings, IIRC), so it could even be used as a "normalizing filter" for SVG images.

tajmone commented 3 years ago

librsvg

From ImageMagick website on SVG support:

ImageMagick utilizes inkscape if its in your execution path otherwise RSVG. If neither are available, ImageMagick reverts to its internal SVG renderer.

The librsvg library (aka GNU RSVG) uses Cairo.

Since librsvg is now being rewritten in Rust (from C), chances are that it might soon become a cross platform library too, since Rust abstracts away most of the complexity of making code work across different OSs, thanks to its Standard Library being cross-platform.

I'm just guessing here, since being cross platform (and especially supported under Windows) has never been a goal. But if it becomes a possibility that can be achieved without too much effort, I don't see why not.

Inkscape

It seems that inkscape is ImageMagick's first choice for handling SVG images. I have inkscape installed on my machine, via Chocolatey, so it's always up to date, even if I don't use it that often.

Inkscape is a GUI application for editing SVG images, along the lines of Corel Draw and other vector illustration apps.

It's a rather big application, unlike Dia, so it might be overkill as a dependency. But I believe it can also be used via the command line, and that it includes various standalone command line components that might be useful for manipulating SVGs.

I'm just mentioning these tools here since SVG and vector images are useful for any digital editions, not just this book, and there's always something new for us to learn on this topic. Although I already knew Dia, this project offered me the chance to learn Dia in depth, especially the hacks and tricks when it comes to integrating it into a cross platform toolchain. And I'd love to discover new tools for handling vector illustrations for technical books, and learn how to use them.

SicroAtGit commented 3 years ago

Regarding the extra pixels at the 'M' label on the striped boxes: #32


I also noticed that your script errors mention the Pango library, not Cairo. I wonder why.

The Cairo plugin uses Pango: https://gitlab.gnome.org/GNOME/dia/-/blob/master/plug-ins/cairo/diacairo.c#L50


I would like to mention here again the tool CairoSVG. It is cross-platform, and the CairoSVG files it creates have no border and are transparent.

Example:

dia -n -t svg sample.dia
cairosvg -o sample_cairo.svg sample.svg
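For several diagrams, the two steps can be wrapped in a small Ruby script. This is only a sketch under stated assumptions: the helper name `conversion_commands` is made up for illustration, it assumes that `dia -n -t svg NAME.dia` writes `NAME.svg` next to the input (as the example above implies), and it uses only the flags shown here.

```ruby
# Build (but do not run) the two commands needed for one Dia project.
# Assumption: `dia -n -t svg NAME.dia` writes NAME.svg next to the input file.
def conversion_commands(dia_file)
  base = dia_file.sub(/\.dia\z/, "")
  [
    ["dia", "-n", "-t", "svg", dia_file],                   # Dia project -> default SVG
    ["cairosvg", "-o", "#{base}_cairo.svg", "#{base}.svg"]  # default SVG -> CairoSVG
  ]
end

if __FILE__ == $PROGRAM_NAME
  # Convert every project named on the command line; stop at the first failure.
  ARGV.each do |dia_file|
    conversion_commands(dia_file).each do |cmd|
      system(*cmd) or abort("command failed: #{cmd.join(' ')}")
    end
  end
end
```

Passing the commands to `system` as argument arrays avoids shell quoting problems with file names that contain spaces.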

Available parameters:

$ cairosvg -h
usage: cairosvg [-h] [-v] [-f {eps,pdf,png,ps,svg}] [-d DPI] [-W WIDTH] [-H HEIGHT] [-s SCALE]
                [-b COLOR] [-n] [-i] [-u] [--output-width OUTPUT_WIDTH]
                [--output-height OUTPUT_HEIGHT] [-o OUTPUT]
                input

Convert SVG files to other formats

positional arguments:
  input                 input filename or URL

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
  -f {eps,pdf,png,ps,svg}, --format {eps,pdf,png,ps,svg}
                        output format
  -d DPI, --dpi DPI     ratio between 1 inch and 1 pixel
  -W WIDTH, --width WIDTH
                        width of the parent container in pixels
  -H HEIGHT, --height HEIGHT
                        height of the parent container in pixels
  -s SCALE, --scale SCALE
                        output scaling factor
  -b COLOR, --background COLOR
                        output background color
  -n, --negate-colors   replace every vector color with its complement
  -i, --invert-images   replace every raster pixel with its complementary color
  -u, --unsafe          resolve XML entities and allow very large files (WARNING: vulnerable to
                        XXE attacks and various DoS)
  --output-width OUTPUT_WIDTH
                        desired output width in pixels
  --output-height OUTPUT_HEIGHT
                        desired output height in pixels
  -o OUTPUT, --output OUTPUT
                        output filename

tajmone commented 3 years ago

The Cairo plugin uses Pango:

I see. An endless list of libraries within libraries, with apps ... It only takes a different version for an OS build to break guarantees of "pixel-perfect" identical results in the final output (although SVGs are vectors, but you get what I mean).

I wonder how much the fact that Dia has separate repos for the Linux and Windows versions can affect Dia in terms of the library versions it uses. But I understand that managing cross platform Dia in a single repository could complicate things for the maintainers.


I would like to mention here again the tool CairoSVG. It is cross-platform, and the CairoSVG files it creates have no border and are transparent.

I didn't know that, I knew Cairo only as a library.

I see it has quite interesting CLI options, giving fine-grain control over the SVG.

Intermediate SVG Manipulation Step to Enforce Transparency

I was wondering if we could add an intermediate manipulation step, before feeding the SVG images to SVGO, to enforce transparency. Since SVG images are just XML, we could assign to all elements that need to be transparent a specific colour (one that is unlikely to be used, e.g. a neon fuchsia) and then either:

This shouldn't be too hard to achieve, and we could ensure that the background boxes have transparent background and border (or zero-width border, whichever works best). Hoping that SVGO doesn't drop invisible elements as part of the optimization process.

It's worth a try, it could be a good solution until Dia can universally handle transparency and zero-width borders in all editions. Also, this fix would probably allow us to use any Dia edition to build the diagrams.

I think that the SVG standard allows enforcing padding and/or a specific canvas size, without having to use the BG Box hack, but unfortunately Dia doesn't seem to support these features, so we have to keep the BG Box for now (which is probably also easier to work with, since it's something that you can see in the GUI, unlike some abstract document settings).

Probably you can manipulate SVG images in all sorts of ways in Dia, using Python plugins, but the Windows edition doesn't seem to support Python plugins at the moment.

SicroAtGit commented 3 years ago

I figured out how we need to modify the CairoSVG files from Dia to make them transparent:

$ cd docs_src/images/
$ sed -i 's/fill-opacity:1;stroke-width:0/fill-opacity:0;stroke-width:0/' *.svg

The above solution works only with the Windows version of Dia (at least the version we specified). For the Linux version (which is more recent) I will also look for a solution, there also for the border problem.

SicroAtGit commented 3 years ago

After further thought and experimentation: Even though the above solution works for our current SVGs, it is not a universal solution for Dia-CairoSVGs because the search string is too imprecise. I'll drop the idea.

In my opinion, the best solution currently is to use the CairoSVG tool mentioned above for converting the SVGs to CairoSVGs. The transparencies are preserved and the border problem is also solved.

tajmone commented 3 years ago

After further thought and experimentation: Even though the above solution works for our current SVGs, it is not a universal solution for Dia-CairoSVGs because the search string is too imprecise. I'll drop the idea.

But I think that it could work, as long as we adopt a specific colour for the backgrounds that should be transparent (e.g. a neon fuchsia, which we'd never use in the diagrams); then the RegEx can detect with precision those elements which need to be made transparent, and tweak their fill-opacity: value.

In my opinion, the best solution currently is to use the CairoSVG tool mentioned above for converting the SVGs to CairoSVGs. The transparencies are preserved and the border problem is also solved.

Would that still require to output from Dia using the CairoSVG format, or would it work even on the default SVG format supported by Dia?

In the latter case, then it would also solve the Dia version problem, and we could then use any late version of Dia, since the zero-width border problem only concerns the CairoSVG output format (i.e. when using the default SVG format the different Dia versions are consistent in their output).

That would be a huge relief. But we need to check that:

  1. The latest version of Dia (which is available for Win, Linux and macOS) produces consistent (identical?) SVG images on all OSs (or at least Win and Linux).
  2. The Cairo CLI tool can fix the BG border/colour transparency by processing an SVG image created using the default Dia SVG filter.

I don't know how the Cairo CLI tool works, but can it target specific colours (border and BG) and change their elements to make them transparent?

It's not a problem to set the BG Box in the diagrams to use a dedicated colour (on the contrary, it makes it easy to see if the frame is the right size) since the BG Box is in a separate layer which can be hidden from view while working.

SicroAtGit commented 3 years ago

But I think that it could work, as long as we adopt a specific colour for the backgrounds that should be transparent

For setting the color there are fill: and stroke: and for setting the transparency strength there are fill-opacity: and stroke-opacity:.

If fill: has our specified transparency color, fill-opacity: must be set to zero and if stroke: has our specified transparency color, stroke-opacity: must be set to zero. If the required -opacity: attribute does not already exist, it must be added.

I don't know if this is possible with sed alone. Anyway, it is not a simple task. I don't like it.
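For comparison, here is how the rule above might look as a short Ruby script rather than as a sed one-liner. It is a sketch, not a tested part of the toolchain: the sentinel colour `#ff00ff` and the names `make_sentinel_transparent` and `process_svg` are assumptions chosen for illustration, and it only looks at double-quoted `style="..."` attributes, which is how Dia's default SVG output writes colours in the samples seen so far.

```ruby
# Hypothetical sentinel colour reserved for "this must become transparent".
SENTINEL = "#ff00ff"

# Rewrite one style string: for each of fill/stroke painted in the sentinel
# colour, set the matching *-opacity declaration to 0, adding it if missing.
# Styles that do not use the sentinel are returned untouched.
def make_sentinel_transparent(style, sentinel = SENTINEL)
  # Parse "name: value; name:value" into an insertion-ordered hash.
  decls = style.split(";").map(&:strip).reject(&:empty?).to_h { |decl|
    name, value = decl.split(":", 2)
    [name.strip, value.to_s.strip]
  }
  hits = %w[fill stroke].select { |prop| decls[prop]&.casecmp?(sentinel) }
  return style if hits.empty?

  hits.each { |prop| decls["#{prop}-opacity"] = "0" }
  decls.map { |name, value| "#{name}: #{value}" }.join("; ")
end

# Apply the rewrite to every style="..." attribute of an SVG document string.
def process_svg(svg, sentinel = SENTINEL)
  svg.gsub(/style="([^"]*)"/) do
    %(style="#{make_sentinel_transparent(Regexp.last_match(1), sentinel)}")
  end
end
```

Because the match is on a declaration's parsed name and value rather than on a fixed substring, it avoids the imprecise search string that made the earlier sed attempt fragile, at the cost of being more than a one-liner.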

Would that still require to output from Dia using the CairoSVG format, or would it work even on the default SVG format supported by Dia?

The CairoSVG tool takes the Dia SVGs (default SVG format) and creates the CairoSVGs, which are transparent and have no black border.

$ dia -n -t svg file.dia
$ cairosvg -o file.svg file.svg

In the file file.svg the texts are now also paths, as it is with the output of

$ dia -n -t cairosvg file.dia

  1. The latest version of Dia (which is available for Win, Linux and macOS) produces consistent (identical?) SVG images on all OSs (or at least Win and Linux).

I will test it with the Windows and Linux version. I do not have a macOS system.

  2. The Cairo CLI tool can fix the BG border/colour transparency by processing an SVG image created using the default Dia SVG filter.

That is the reason why I mentioned the tool. Tested with the current Linux Dia version and the Windows Dia version specified in this project. I haven't tried it with the current Windows Dia version yet.

tajmone commented 3 years ago

This sounds really promising, and frees us from having to stick to a specific Dia version (which is not available for Linux and macOS, and which is problematic to install side by side with another Dia version in Linux).

What I don't understand is how the cairosvg tool can target specific elements to set their border and bg colour transparency (i.e. just the BG Box, not the diagram boxes). E.g. some diagrams might have white background, but their colour should be retained, e.g. in a template that doesn't have white as the default page colour. Is that possible?

SicroAtGit commented 2 years ago

The default SVG output files from the current Linux Dia version (0.97.3) and the current Windows Dia version (0.97.2) are not identical. They have different text heights, as in the GIF image above.

The default SVG output files from the Windows Dia version we specified in the project and the current Windows Dia version are not identical. The SVG attributes have a different order and color hexadecimal values are slightly different. But when I compare them in my SVG viewer, I don't see any differences. There must be very minimal color deviations.

Because of the different default SVG outputs, of course the CairoSVG tool generates different CairoSVG files.

Because of the text height differences between Windows Dia and Linux Dia, we should not use both. We should continue to use only Windows Dia because it is cross-platform usable (on Linux and macOS with Wine).

What we can do is to change the Windows Dia version specified in the project to the current one (0.97.2).

The CairoSVG files created by the CairoSVG tool from the default SVG files will keep the transparency and will not have the border problem, no matter which Dia version was used to create the default SVG files.

There is one bad thing about the CairoSVG tool: it is hard to install on Windows because pip3 install cairosvg does not install the required libcairo-2 library. In the bin directory of Windows Dia there is this library, but it can not be used because it is 32-bit and CairoSVG is 64-bit (probably depending on the installed Python bit version). I installed the GIMP program in Windows 10 on the test notebook, which includes the required libcairo-2 library in the 64-bit version, and added the directory where this library is located to the PATH environment variable. Therefore, it would be easier if you install the tool via the Windows 10 Linux subsystem.

As we noted, your fonts in the diagrams are not identical to mine and you can't do anything about that because you need your vanilla OS. What do you think about using Linux Dia as well? Then in the Windows 10 Linux subsystem you can install the fonts correctly and we both have identical diagrams.

What I don't understand is how the cairosvg tool can target specific elements to set their border and bg colour transparency

The CairoSVG tool does not make the decision whether it should be transparent or not depending on the color of the diagram elements. The elements are already transparent in the default SVGs and CairoSVG takes it as it is. The Dia built-in CairoSVG plugin adds white elements where the elements are actually set to be transparent.

Here is a diagram with a rectangle. The rectangle has a white background and the "Draw Background" property is set. Saved as default SVG:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.0//EN" "http://www.w3.org/TR/2001/PR-SVG-20010719/DTD/svg10.dtd">
<svg width="13cm" height="9cm" viewBox="167 54 260 171" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
  <g>
    <rect style="fill: #ffffff" x="169" y="56" width="257" height="168"/>
    <rect style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #000000" x="169" y="56" width="257" height="168"/>
    <text font-size="12.8" style="fill: #000000;text-anchor:middle;font-family:sans-serif;font-style:normal;font-weight:normal" x="297.5" y="143.881">
      <tspan x="297.5" y="143.881"></tspan>
    </text>
  </g>
</svg>

The same diagram again, but this time the rectangle element does not have the "Draw Background" property set:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.0//EN" "http://www.w3.org/TR/2001/PR-SVG-20010719/DTD/svg10.dtd">
<svg width="13cm" height="9cm" viewBox="167 54 260 171" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
  <g>
    <rect style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #000000" x="169" y="56" width="257" height="168"/>
    <text font-size="12.8" style="fill: #000000;text-anchor:middle;font-family:sans-serif;font-style:normal;font-weight:normal" x="297.5" y="143.881">
      <tspan x="297.5" y="143.881"></tspan>
    </text>
  </g>
</svg>

You see that a <rect line is missing. It is the rectangle with the white background that is missing, causing the rectangle to become transparent.
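The difference between the two files can also be checked mechanically. The Ruby sketch below lists the `<rect>` elements that actually paint a fill; the helper name `filled_rect_styles` is hypothetical, and the regular expressions assume the attribute layout of the two samples above (a double-quoted `style` attribute on each `<rect>`).

```ruby
# Return the style strings of all <rect> elements that paint a fill,
# i.e. that carry a `fill` declaration other than `none`.
def filled_rect_styles(svg)
  svg.scan(/<rect\b[^>]*?\bstyle="([^"]*)"/).flatten.select do |style|
    # Match `fill` only as a whole declaration name, so `fill-opacity` is ignored.
    fill = style[/(?:\A|;)\s*fill\s*:\s*([^;]+)/, 1]
    fill && fill.strip != "none"
  end
end

with_bg = <<~SVG
  <rect style="fill: #ffffff" x="169" y="56" width="257" height="168"/>
  <rect style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #000000" x="169" y="56" width="257" height="168"/>
SVG

without_bg = <<~SVG
  <rect style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #000000" x="169" y="56" width="257" height="168"/>
SVG

puts filled_rect_styles(with_bg).length     # the white background rectangle
puts filled_rect_styles(without_bg).length  # border only, nothing is filled
```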

tajmone commented 2 years ago

The default SVG output files from the current Linux Dia version (0.97.3) and the current Windows Dia version (0.97.2) are not identical. They have different text heights, as in the GIF image above.

So there must be an OS layer behind these differences, since I believe you've tested this using identical fonts.

The current Windows Dia I have locally is actually 0.97.2-2, but that last -2 is probably not mentioned in the download version info (some minor patch on same version?).

The default SVG output files from the Windows Dia version we specified in the project and the current Windows Dia version are not identical. The SVG attributes have a different order and color hexadecimal values are slightly different. But when I compare them in my SVG viewer, I don't see any differences. There must be very minimal color deviations.

I wonder why the colour differences, probably colours are converted to another intermediate representation during the SVG conversion process (CLab?), so even if they were defined as hex values in Dia, and stored as hex colours in the final SVG, some minor value changes occur due to floating point rounding in the various conversions to another colour format.
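To make that guess concrete, here is a purely illustrative Ruby sketch of how an 8-bit channel can shift by one when it passes through a fractional representation kept at limited precision. Nothing here is taken from Dia's code; the function names and the two-decimal precision are assumptions picked only to show the effect.

```ruby
# Round-trip one 8-bit channel through a fraction in 0..1 that is kept
# to `digits` decimal places (a stand-in for any lossy intermediate format).
def round_trip(channel, digits)
  fraction = (channel / 255.0).round(digits)
  (fraction * 255).round
end

# Apply the round trip to each channel of a "#rrggbb" colour.
def round_trip_hex(hex, digits)
  channels = hex.delete("#").scan(/../).map { |pair| pair.to_i(16) }
  "#" + channels.map { |c| format("%02x", round_trip(c, digits)) }.join
end

puts round_trip_hex("#646464", 2)  # => #636363 (100/255 is stored as 0.39, which maps back to 99)
puts round_trip_hex("#646464", 6)  # => #646464 (enough precision, value survives)
```

A one-step shift like this would be invisible in a viewer but is enough for Git to report the file as changed.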

Because of the different default SVG outputs, of course the CairoSVG tool generates different CairoSVG files.

Because of the text height differences between Windows Dia and Linux Dia, we should not use both. We should continue to use only Windows Dia because it is cross-platform usable (on Linux and macOS with Wine).

[...] The CairoSVG files created by the CairoSVG tool from the default SVG files will keep the transparency and will not have the border problem, no matter which Dia version was used to create the default SVG files.

OK, then we're still stuck with a specific Dia version (and Cairo tool too, to be safe) and the Win OS, but we gain that we finally have a fully transparent BG Box. That's fine, and well worth the effort.

NOTE: I've been wondering if we should attempt to tamper with the Dia sources and try to come up with a trimmed down command line version that only converts from Dia projects to SVG (default), and nothing else. It should boil down to stripping away the whole GUI code, plug-ins, non-SVG libraries and code, etc., until you're left with a fairly small codebase, which most likely will compile on all OSs, regardless of whether we started from the Dia for Win repository or the Linux/macOS repo (for sure, the Win version has two separate binary files, one for the GUI and one for the CLI; IIRC the Linux version has only one binary).

We could name this tool dia2svg and distribute it in pre-compiled archives, bundled with the Cairo tool, so you'd only need to invoke it and it would do all the magic in a single pass, CairoSVG reconversion included. It might not be as hard as it looks.

...

What we can do is to change the Windows Dia version specified in the project to the current one (0.97.2).

Sure. At least end users can install/keep only one version locally, and benefit from the latest features when working with the GUI.

The CairoSVG tool does not make the decision whether it should be transparent or not depending on the color of the diagram elements. The elements are already transparent in the default SVGs ...

Right, I forgot that this problem occurred only when using the CairoSVG filter (which solved the font problem but broke the transparency/borders one).

Here is a diagram with a rectangle ...

Interesting, when it has to draw the rect border it actually draws two rectangles, not just one (one filled, the other not, but the latter has a stroke value). When it "sees" a zero-width line it is smart enough to simply not draw anything.

Ok, I think this is a good improvement in the toolchain: we'll not only gain the ability to preserve transparency of the BG Box (or any elements we need to) but we'll also be able to update the Dia version when/if new releases are out. We'd probably get slightly different SVG results, but that's not an issue since it's justified, and it's better to have to re-commit all the SVG images now and then (as we'd have to anyway, due to SVGO updates) than to be stuck with an old Dia version when Dia gets updated.

Could you please provide me with a link to the exact Cairo CLI tool you've been using?

I shall then start working on the new build system for the diagrams. I think we should also provide a Zip archive somewhere with the Dia and Cairo binaries (maybe just the CLI stuff) so end users (or even the build script/Rake) can just download it and unpack it (simple and easy).

What do you think?

SicroAtGit commented 2 years ago

I didn't realize you had already replied. Please take a look at my last edit as well.

tajmone commented 2 years ago

I didn't realize you had already replied. Please take a look at my last edit as well.

OK, I can see the new text!

There is one bad thing about the CairoSVG tool: it is hard to install on Windows because pip3 install cairosvg does not install ...

I still think we should try to come up with a Zip archive bundling all the required tools. I think on Windows it's fairly easy to do that, including finding all the right DLLs (might take some time but it's doable). I really want to come up with an easy and straightforward "set it and forget it" solution, especially since I rely on Dia for many other projects too.

Can you provide me with the final links to the Cairo CLI tool we'll be using, so I can start looking into it (or its installer, whatever the link offers)?

Bear in mind that I also use MSYS2 locally, so it shouldn't be hard for me to build a custom Win package either, and then turn it into a standalone Zip to unpack and use.

As we noted, your fonts in the diagrams are not identical to mine and you can't do anything about that because you need your vanilla OS. What do you think about using Linux Dia as well?

I could try and come up with a fix for this (a simple script that copies the native fonts before replacing them, and another to restore them). But since we'll be using Wine, chances are that the fonts will be identical. Another solution might be to try and override the fonts path somehow in the script. I'll look into it.

tajmone commented 2 years ago

No to CairoSVG!

@SicroAtGit, after having spent over an hour looking into the issue, I'm discarding the CairoSVG tool as a viable option, because it isn't viable at all.

I originally thought it was part of the Cairo library project:

https://www.cairographics.org

which boils down to using three simple DLLs under Windows. It's the Cairo library, as used by the GTK project, which is also available for Windows.

But CairoSVG obviously something else, another product altogether. I though Python was required only for the toolchain (as is often the case), to build the binaries, etc., but it turns out it's an actual Python app.

These @!#@@$!! diagrams have eaten up 90% of the time we've devoted to the book. I don't see how adding any solution which does not consist in a binary standalone command line application is going to solve our problem.

Adding more dependencies (Python) is not going to help, and bundling it up as a "standalone Python package" would require tons of DLLs, files, etc. It simply doesn't make any sense to me. When you think about it, even the Dia project opted to leave the plug-ins feature out of Dia for Windows, in order to not have to deal with Python (and for good reasons).

We are speaking about the SVG format, i.e. nothing more than XML. All we need is to tweak a couple of XML attributes to enable transparency in the Box BG and its border (we already have the vector fonts now), so we might just as well pick a fixed colour, parse the XML and change those attributes.

If it's not a simple solution, if it's not a CLI tool (or a script), then I don't see it worth investing time in it. The diagrams are not the final product of the project, they are just accessory images. Sure, it would be nice to come up with a neat intermediate solution for post-processing Dia SVGs so that we finally get the transparency in the images, but since we already have decent diagrams for the book, I personally don't think the price is worth it if it's going to bring in more complexity, dependencies, and even more versioning nightmares.

I was hoping to work on the chunked version of the book today, but most of the time has been consumed looking into CairoSVG, only to discover it hadn't really much to do with the Cairo library I knew.

The SVG format isn't that complicated, especially the SVG emitted by Dia, which sticks to a very basic version of SVG. We could easily have written our own SVG tool to fix the transparency, using PB and the Expat/XML library, a task that would have probably taken us no more than a couple of hours.

When I have time, I'll look into some ready built command line tools that can handle SVG, which are either cross-platform by design or Windows only (after all, it seems we can't escape using Dia on Wine). ImageMagick is a well known tool, so when I have a couple of hours to spare I'll download the latest version and see if it's of any help to us.

But I'm personally quite happy with the current Diagrams situation, even if we don't have the transparency, at least we have the vector text (which is much more important IMO).

SicroAtGit commented 2 years ago

So there must be an OS layer behind these differences, since I believe you've tested this using identical fonts.

The fonts we have specified in this project are installed correctly.

Yes, many factors can be responsible for the difference. It can be that I have a newer Cairo library version in the system or between Dia 0.97.2 and Dia 0.97.3 there are changes that lead to these differences, or ...

Let's focus on Windows Dia. As I tested in detail above, when I run Windows Dia via Wine under Linux, I get exactly identical results as Windows Dia under native Windows (SVG files have the same MD5 hash). Even under other Windows versions, the same Dia version produces exactly the same SVG files (same MD5 hash), tested extensively above with Windows 7 and Windows 10, both tested native and in VirtualBox. The only requirement is that the fonts specified in the project are installed correctly.
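For anyone who wants to repeat this kind of comparison, here is a minimal Ruby sketch of the idea: hash every SVG in two output folders and list the ones that differ. The function names and the folder layout are assumptions made for illustration; this is not the procedure that was actually used for the tests above.

```ruby
require 'digest/md5'

# Return the MD5 hex digest of a file's bytes.
def md5_of(path)
  Digest::MD5.file(path).hexdigest
end

# Compare two builds of the same diagrams, matched by file name.
# Returns the names of the SVGs in dir_a whose bytes differ from the
# file of the same name in dir_b, or that are missing from dir_b.
def differing_svgs(dir_a, dir_b)
  names = Dir.glob(File.join(dir_a, '*.svg')).map { |p| File.basename(p) }.sort
  names.reject do |name|
    other = File.join(dir_b, name)
    File.exist?(other) && md5_of(File.join(dir_a, name)) == md5_of(other)
  end
end
```

An empty result means both builds are byte-identical, which is the condition that keeps Git from reporting spurious changes.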

The current Windows Dia I have locally is actually 0.97.2-2, but that last -2 is probably not mentioned in the download version info (some minor patch on same version?).

In the name of the download file, the version is mentioned in full: dia-setup-0.97.2-2-unsigned.exe. dia.exe -v mentions only the incomplete version 0.97.2. In the change log the version 0.97.2-2 is not mentioned: https://gitlab.gnome.org/GNOME/dia/-/blob/master/NEWS#L156 No idea what was changed between 0.97.2 and 0.97.2-2.
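Since the Rake tasks are supposed to check the Dia version before converting, here is a rough Ruby sketch of how the two version strings could be told apart. Everything in it is hypothetical: the constant, the function names, and above all the sample `-v` text, since the exact wording Dia prints is not quoted in this thread.

```ruby
# Version treated as the required one in this sketch (an assumption).
REQUIRED_DIA = '0.97.2'

# Matches "0.97.2" and, optionally, a trailing package suffix like "-2".
VERSION_RE = /(\d+\.\d+(?:\.\d+)?)(-\d+)?/

# Full version found in the text, including a "-N" suffix if present.
def full_version(text)
  text[VERSION_RE]
end

# Version without the package suffix, i.e. what `dia -v` is said to report.
def base_version(text)
  text[VERSION_RE, 1]
end

# True when the reported base version is exactly the required one.
def dia_usable?(reported_text)
  base_version(reported_text) == REQUIRED_DIA
end

installer = 'dia-setup-0.97.2-2-unsigned.exe'
puts full_version(installer)           # version as named in the download file
puts base_version(installer)           # version without the package suffix
puts dia_usable?('Dia version 0.97.2') # hypothetical -v text, wording may differ
```

The point is only that a check based on `-v` cannot distinguish 0.97.2 from 0.97.2-2, so the task would have to compare the base version and accept that limitation.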

I wonder why the colour differences, probably colours are converted to another intermediate representation during the SVG conversion process (CLab?), so even if they were defined as hex values in Dia, and stored as hex colours in the final SVG, some minor value changes occur due to floating point rounding in the various conversions to another colour format.

Maybe. Or color calculations were incorrect and have been fixed in the new Dia version.

I've been wondering if we should attempt to tamper with the Dia sources and try to come up with a trimmed down command line version that only converts from Dia projects to SVG (default), and nothing else. [...] It might not be as hard as it looks.

Even though I have almost no knowledge about C programming languages, we can take a look at the source code of Dia and think about it.

I could try and come up with a fix for this (a simple script that copies the native fonts before replacing them, and another to restore them). But since we'll be using Wine, chances are that the fonts will be identical.

You mean you use the Windows Dia via Wine in Windows 10 Linux subsystem? That would also be a solution. Wine has its own font directory, which keeps your Windows 10 clean.

I thought Python was required only for the toolchain (as is often the case), to build the binaries, etc.

I didn't know that Python is also used as a building tool, so I assumed it was obvious that CairoSVG is a Python tool.

Adding more dependencies (Python) is not going to help

CairoSVG is the only command-line tool I have found that can solve our problem easily (apart from the installation under Windows). Yes, the many other dependencies are bad, but since the creation of the SVGs is not part of the main build and the SVGs need to be created very rarely, I didn't see it as that problematic. But the installation of CairoSVG on Windows is terrible, I realized that too late.

I was hoping to work on the chunked version of the book today, but most of the time has been consumed looking into CairoSVG, only to discover it hadn't really much to do with the Cairo library I knew.

Even though too much time has now been wasted, the realization that the CairoSVG tool is not an option is good. When searching the internet for svg to cairosvg, the search results mostly lead to this tool. At some point we would have come across this tool anyway, if not in this project, then in some other, I think.

We could have easily have written our own SVG tool to fix the transparency, using PB and the Expat/XML library

I will try to write such a tool in PB in the next few days (next weekend at the latest).

ImageMagick is a well known tool, so when I have a couple of hours to spare I'll download the latest version and see if it's of any help to us.

The packages of the portable versions of ImageMagick are unfortunately also very huge (about 100 MB): Imagemagick Windows (look for portable in the filename)

But I'm personally quite happy with the current Diagrams situation, even if we don't have the transparency, at least we have the vector text (which is much more important IMO).

Yes, me too.

Let transparency be my problem. I know now that you are still looking for a solution that is completely portable (without installation), that can be easily provided in a ZIP package and only needs to be unpacked.

tajmone commented 2 years ago

No idea what was changed between 0.97.2 and 0.97.2-2.

It must be something really small, like an afterthought or typo corrections. I wouldn't worry about that.

Yes, many factors can be responsible for the difference. It can be that I have a newer Cairo library version in the system or between Dia 0.97.2 and Dia 0.97.3 there are changes that lead to these differences, or ...

Unfortunately, with these types of pipelines there's never the absolute certainty that the results will be identical: from system fonts being updated differently on different OSs, to small differences in tool or system libraries, up to how each OS (or even hardware) rounds decimal numbers ... every single tool (especially SVGO) adds an element of uncertainty.

The ideal solution would be a single tool that handles everything, not relying on system libraries but using ad hoc code for everything (including colour translation algorithms, etc.). I don't think there's a similar tool out there, mostly because these problems are specific to version control, whereas SVG as a standard is bound to be approximate, since vector graphics depend entirely on the rasterizing library for being viewed by end users (or to generate a raster image, etc.).

Our problem, in fact, is not that these small, tiny, minuscule differences have significant impact on the final images, but just that we want to avoid the build toolchain resulting in Git noise, i.e. spurious changes which are seen as changed file in the work area, which could lead to endless rewriting of the diagrams. If we keep the Dia build separate from the main Rakefile we should be good with that, and whoever needs to rebuild the diagrams will commit whatever there's to commit (i.e. including unmodified diagrams that produced differing SVGs) — that's an acceptable price to pay, considered that the diagrams should be updated rarely.

This includes the fonts problem (MS vs OpenSource version) IMO: even if you and I get differing results, who cares? If you rebuild the diagrams, just rebuild them all, and the same with me. The final results are almost identical to the end user.

BTW, I still think that even using identical fonts we might get non-identical results, because of the Win7 vs Win10 differences. Win10 uses a different fonts library, in fact it supports new font types which are not supported by Win7, plus other new features for ultra high resolution monitors, retina screens, etc. If the libraries are not identical, the results probably won't be either. Win11 has entirely revamped the look and feel of the Desktop and user interfaces, so I guess this affects fonts too.

Maybe. Or color calculations were incorrect and have been fixed in the new Dia version.

Most good graphic libraries manipulate colours using CieLab, not hex values, and also support colour profiles for colours transformations. I've ported various CieLab algorithms of the Delta family (colour distance measuring) to PB, and I can assure you that the same algorithms produce different results, depending on the language used, the OS, and other factors. Even following the decimal precision rounding guidelines, I wasn't able to get identical results to those of the online test suite (microscopic differences in some colours, due to the various colour systems translations involved in the algorithm).
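To give an idea of how many floating point steps sit between a hex colour and a CIELab value, here is a small Ruby sketch of the simplest member of the Delta family (CIE76). It assumes sRGB input and a D65 white point, and uses the commonly published conversion constants. It is not the PB port mentioned above, and no claim is made that its output matches any particular test suite.

```ruby
# Convert one sRGB channel (0..255) to linear light.
def linear(channel)
  c = channel / 255.0
  c <= 0.04045 ? c / 12.92 : ((c + 0.055) / 1.055)**2.4
end

# Non-linear helper used by the XYZ -> Lab conversion.
def lab_f(t)
  delta = 6.0 / 29
  t > delta**3 ? Math.cbrt(t) : t / (3 * delta**2) + 4.0 / 29
end

# "#rrggbb" -> [L, a, b], assuming sRGB input and a D65 white point.
def hex_to_lab(hex)
  r, g, b = hex.delete('#').scan(/../).map { |h| linear(h.hex) }
  x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
  y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
  z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b
  fx = lab_f(x / 0.95047)
  fy = lab_f(y)
  fz = lab_f(z / 1.08883)
  [116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)]
end

# CIE76 colour difference: Euclidean distance between two Lab colours.
def delta_e76(hex1, hex2)
  lab1 = hex_to_lab(hex1)
  lab2 = hex_to_lab(hex2)
  Math.sqrt(lab1.zip(lab2).sum { |p, q| (p - q)**2 })
end

puts delta_e76('#ffffff', '#fffffe')
```

Each colour goes through a power function, a matrix multiplication and a cube root before the distance is taken, so tiny rounding differences between libraries or platforms are not surprising.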

Probably our problem here is that we're using tools which are too advanced for our needs, since we're targeting HTML pages and we're fine just using hex colours. Even for the SVG tags, we only need the very basic ones from the standard vanilla SVG format supported even by old browsers (no animations, no CSS3 effects).

I wrote a Pixel Art plugin to convert raster images to SVG (pixel art only) in PB, and it took me less than a day. I didn't even have to use Expat, since the tags were so simple that I just manually wrote their string templates. But of course, when we start adding advanced and powerful tools (like Cairo and CairoSVG), which were designed with typography standards in mind (colour definitions for ink, not screen) then we might end up getting entangled in these problems because of the advanced algorithms at play.
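As an illustration of the string templates approach, here is a short sketch in Ruby (not PB, and not the plugin's actual code). The function name, the grid/palette layout and the `scale` parameter are all made up for the example.

```ruby
# Build an SVG from a grid of palette indexes using plain string templates.
# `pixels` is an array of rows; `palette` maps an index to a hex colour.
# A nil palette entry is treated as transparent, so that pixel is skipped.
def pixels_to_svg(pixels, palette, scale = 10)
  height = pixels.length
  width = pixels.first.length
  rects = []
  pixels.each_with_index do |row, y|
    row.each_with_index do |index, x|
      colour = palette[index]
      next if colour.nil?

      rects << format('  <rect x="%d" y="%d" width="%d" height="%d" fill="%s"/>',
                      x * scale, y * scale, scale, scale, colour)
    end
  end
  header = format('<svg xmlns="http://www.w3.org/2000/svg" width="%d" height="%d">',
                  width * scale, height * scale)
  ([header] + rects + ['</svg>']).join("\n")
end

puts pixels_to_svg([[0, 1], [1, 0]], ['#000000', nil])
```

With only rectangles and hex colours involved, no XML library is needed and the output is fully deterministic.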

Even though I have almost no knowledge about C programming languages, we can take a look at the source code of Dia and think about it.

If we strip away the GUI app code, what's left should be a fairly small command line tool. Since we'd only be interested in the SVG output filter, it shouldn't be too hard to strip away the code that handles the other image formats, the plugins system, etc.

That would be a good solution for it would provide a Dia-based tool which is independent from the Dia package, but still handles Dia source projects. If we achieved that, even integrating updates from Dia upstream shouldn't be too hard, since the output filters are like independent components.

But then, again, Dia projects are just XML. So parsing and interpreting them shouldn't be too hard in any language really. Let's not forget that the value of Dia lies in its simple to use GUI interface, and all the libraries and presets. Since I'll be using SVG diagrams a lot in my eBooks projects, I wouldn't mind investing energy in tools that can translate Dia project files into SVG images for HTML contents (vector text, transparency, zero-width borders, hex colours only, etc.), since this would still allow using Dia to design the diagrams, but use custom tools to generate the SVGs.

If I were to embark on this, I'd probably come up with a different logic for the final SVGs, e.g. allowing padding around the image to be enforced through an external settings file; keeping all SVG images proportionate by adding DPI info to the SVG, so they don't just blow up taking the entire space available, etc.

SVG is not a web specific format, which is why these SVG libraries are so complex. But focusing on a specific output format, and on diagram images only, narrows the scope. For example, in my Pixel Art plugin I was targeting pixelated images only, so I only needed squares and rectangles, working with indexed palettes of RGB colours (max 256), since that's how Pixel Art works. This narrowed down the working scope to a very specific domain, where I could handle everything with simple strings.

From what I learned by peeping into Dia source projects, the way it incorporates library presets (which are usually just SVG images, although raster images are allowed too) is fairly simple, and colours are stored as hex triplet values. Shapes have very basic info too, like padding, radius, etc., and the same goes for fonts.

The hardest part would be converting text to vectors, which would have to be delegated to a specialized library. BTW, there are not many FOSS libraries to handle this, and most FOSS tools rely on the same libraries, albeit in different versions (or with different compiler options) because of potential conflicts with other libraries (or license issues). I can't remember the name of a famous library that handles font transformation (to raster images or vector paths); it's on the tip of my tongue but I can't capture it (something to do with Persian).

You mean you use the Windows Dia via Wine in Windows 10 Linux subsystem?

No, I meant a script that would create a copy of the system installed font, then replace it with the FOSS version, and the other way round. Currently, I'm installing the FOSS fonts in my User folder, which takes precedence over the System fonts (so I only have to delete the FOSS fonts and I'm back to the vanilla settings), but for some reason Dia is not seeing them. Maybe it's running in emulation mode? Or using old deprecated WinAPI interfaces? No idea why, but I've followed the correct approach, by installing the fonts in the User folder instead of touching the System fonts (which are part of the Windows Updates tracked contents).

Using a script to override the sys fonts is the only other solution that comes to my mind — a horrible hack.

I didn't know that Python is also used as a building tool, so I assumed it was obvious that CairoSVG is a Python tool.

Like Rake with Ruby, there are many (more than in Ruby) Python build tools, package installers, and test-suites which are often used in C/C++ projects. In fact you often see in GitHub repos Python showing up as 3%, or the like, which indicates Python being used in the toolchain only.

Yes, the many other dependencies are bad, but since the creation of the SVGs is not part of the main build and the SVGs need to be created very rarely, I didn't see it as that problematic.

But it's Python! I don't trust Python. Look what happened with Sublime Text 3, which had adopted a specific Python version for its plug-ins and API, with the assurance from Python org that that specific version would always receive security patches (or at least there was a very long-term LTS guarantee). But it never happened, and SHQ had to manually patch the Python runtime whenever there was a security concern. Python evolves, and I don't want to be stuck with an old and deprecated version, especially if it's not patched for security (even if it is, we'd have to replace the DLLs manually when this happens). I prefer a compiled CLI tool.

Even though too much time has now been wasted, the realization that the CairoSVG tool is not an option is good. When searching the internet for svg to cairosvg, the search results mostly lead to this tool. At some point we would have come across this tool anyway, if not in this project, then in some other, I think

Of course, we need to know "the scene". Sometimes I just get frustrated with these SVGs because I can't understand why such a simple format can give so many problems (in reality I know, it's because we're using tools that are too powerful, in our case).

Bear in mind that Ruby has various SVG libraries. Since manipulating SVG files translates to manipulating XML, it's all about walking the document tree and operating on its nodes, attributes/properties, etc. Turning an element transparent shouldn't be hard at all, if you know what you're looking for. An XML parser would be more reliable than a RegEx-based script (via SED). The problem is that Dia doesn't seem to provide or allow meta-info, like giving elements custom identifiers/labels, etc. Maybe the BG Box always comes first in the final SVG, since it's always in a separate layer in Dia, the bottom one. If that's the case, and knowing what colour and border size to expect, it should be trivial to make it transparent.

At least Ruby is already a dependency of this project (and usually all our AsciiDoc projects, since we always use the Ruby implementation).
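A minimal sketch of what such a Ruby approach might look like, using REXML (shipped with standard Ruby installs, as a bundled gem in Ruby 3.x). The function name is hypothetical, and the sketch assumes that the colour may appear either as a plain `fill`/`stroke` attribute or inside a `style` attribute; which of the two Dia actually emits should be checked against a real SVG first. It matches by colour rather than by element position.

```ruby
require 'rexml/document'

# Set fill-opacity / stroke-opacity to 0 on every element that uses the
# given colour, whether it is set as a plain attribute (fill="#ffffff")
# or as a declaration inside a style="..." attribute.
def make_transparent(svg_text, colour)
  doc = REXML::Document.new(svg_text)
  target = colour.downcase
  REXML::XPath.each(doc, '//*') do |el|
    # Case 1: colour given as a presentation attribute.
    %w[fill stroke].each do |prop|
      next unless el.attributes[prop].to_s.downcase == target

      el.add_attribute("#{prop}-opacity", '0')
    end

    # Case 2: colour given inside the style attribute.
    style = el.attributes['style']
    next unless style

    decls = style.split(';').map { |d| d.split(':', 2).map(&:strip) }
    decls = decls.select { |d| d.length == 2 }
    changed = false
    %w[fill stroke].each do |prop|
      next unless decls.any? { |k, v| k == prop && v.downcase == target }

      decls.reject! { |k, _| k == "#{prop}-opacity" }
      decls << ["#{prop}-opacity", '0']
      changed = true
    end
    # Only rewrite the style attribute when something actually changed.
    el.add_attribute('style', decls.map { |k, v| "#{k}: #{v}" }.join('; ')) if changed
  end
  doc.to_s
end
```

One caveat: re-serializing through an XML library may change quoting or whitespace compared to what Dia wrote, so the first run could touch every file once even where nothing became transparent.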

The packages of the portable versions of ImageMagick are unfortunately also very huge (about 100 MB)

I always install similar tools via Chocolatey, so I don't have to remember to update them manually. I don't remember the package size, but 100 MB would be reasonable, considering that this tool supports hundreds of image and colour formats, including the legacy images from the Amiga era. There are many complex algorithms, data tables, etc. in ImageMagick, which come at a price.

Let transparency be my problem. I know now that you are still looking for a solution that is completely portable (without installation), that can be easily provided in a ZIP package and only needs to be unpacked.

I'm sure we'll come up with one, eventually. But I don't want this to become a blocking problem for this repository, but rather a general need for our documentation projects. Of course, the whole discussion started here, so it makes sense to carry it on here, unless we end up creating a dedicated repository for this solution/tool.

Honestly, I'm quite interested in the SVG format, and have been for some time. I did occasionally read through the various Specs, and I'd like to study it in more detail at some point, including looking at libraries to handle vector images in SVG. But I'm aware that it's a big topic, so it's going to be a long-term project, full of small steps.

SicroAtGit commented 2 years ago

You mean you use the Windows Dia via Wine in Windows 10 Linux subsystem?

No, I meant a script that would create a copy of the system installed font, then replace it with the FOSS version, and the other way round.

Yes, I understood that, except your last sentence:

But since we'll be using Wine, chances are that the fonts will be identical.

In the GIF above, I compared your output SVG that you created with Dia on Windows 10 with my output SVG that I created with Windows Dia via Wine and they are not identical. Because of that, and because you want to keep your OS clean, I thought you'd try Wine with your Windows 10 Linux subsystem as well.

Currently, I'm installing the FOSS fonts in my User folder, [...] but for some reason Dia is not seeing them in the User folder

Just copying the font files to the correct directory is apparently not enough, you need to install the fonts for the current user.

When I installed Windows 10 natively on a notebook for the above tests, I installed the fonts using the right-click menu, as described here, and Dia recognized the fonts without problems (only installed for current user): https://www.tenforums.com/tutorials/26715-install-fonts-windows-10-a.html

Yes, the many other dependencies are bad, but since the creation of the SVGs is not part of the main build and the SVGs need to be created very rarely, I didn't see it as that problematic.

But it's Python! I don't trust Python. Look what happened with Sublime Text 3 [...] I prefer a compiled CLI tool.

Ok, I understand now.

The packages of the portable versions of ImageMagick are unfortunately also very huge (about 100 MB)

100 MB would be reasonable

I just checked how big the unzipped Dia is (about 60 MB). Ok, 100 MB is then also no problem.


I just wrote a tool in PB that looks for a specific color in the SVG files and sets the corresponding "-opacity" attribute to zero (transparent).

If it is ok to have the tool written in PB, I can create a repository for the tool this weekend. If not, I can publish the source code of the tool as a gist and you may be able to use it as an example of writing a variant in Ruby.

tajmone commented 2 years ago

Just copying the font files to the correct directory is apparently not enough, you need to install the fonts for the current user.

When I installed Windows 10 natively on a notebook for the above tests, I installed the fonts using the right-click menu,

I'll need to check this then. But I'm convinced that these two operations are identical, i.e. that the right click button is just a shortcut for copying to the User fonts folder.

But I'll try it anyway, this time using a font viewer tool to check if the target font has changed before and after the operation (font viewers also provide meta info about the installed fonts, so that might help).

I just wrote a tool in PB that looks for a specific color in the SVG files and sets the corresponding "-opacity" attribute to zero (transparent).

That's great!

If it is ok to have the tool written in PB, I can create a repository for the tool this weekend. If not, I can publish the source code of the tool as a gist and you may be able to use it as an example of writing a variant in Ruby.

If the tool is ready why not publish it? But I also believe that probably we can achieve the same result from within a Rakefile using Ruby (or just Ruby), which might spare dependencies for this project.

That said, the PB tool could grow in features in the course of time, so it might be nice to host it under the Fossy Cats. But if it's too much work, a Gist might be more than enough.

PS — in the past days I've managed to successfully tweak the asciidoctor-chunker tool so that it uses the GitBook prefix in the chunked docs, instead of chap. I now only have to start working on the Rake task on a dev branch.

SicroAtGit commented 2 years ago

I now had time to finish writing the previously mentioned tool into a complete command line tool: Fossy-Cats/SVG-Transparencyer

I will add compiled binaries for Windows and Linux there later.

PS — in the past days I've managed to successfully tweak the asciidoctor-chunker tool so that it uses the GitBook prefix in the chunked docs, instead of chap. I now only have to start working on the Rake task on a dev branch.

Very good!

tajmone commented 2 years ago

I now had time to finish writing the previously mentioned tool into a complete command line tool:

Fossy-Cats/SVG-Transparencyer

Great! And nice name too, I like it.

I intended to implement the split book this weekend, but I'm stuck with a bad backache, so I'll have to postpone it since I can't sit in front of the PC for long.

SicroAtGit commented 2 years ago

Compiled binaries are now also available: https://github.com/Fossy-Cats/SVG-Transparencyer/releases/tag/v1.0.0-beta

Wish you a speedy recovery.

tajmone commented 2 years ago

Great!

Wish you a speedy recovery.

I already have, thanks. Now I'm just catching up on the piled up work. I ought to finish a couple of things and finally be able to dedicate an entire work session to the chunking issue, and look at the transparency too.

At least I've already managed to test some of the required asciidoctor-chunker tweaks, to enforce a custom file name. But I'll have to find an easy way to fix the split-documents titles too (i.e. use the Ch./Sec. title instead of the book title).