fletcher / MultiMarkdown-6

Lightweight markup processor to produce HTML, LaTeX, and more.
https://fletcher.github.io/MultiMarkdown-6/
Other
609 stars 90 forks source link

Missing Image Condition for ODT and Epub Could Be Improved #242

Open ipetraka opened 1 year ago

ipetraka commented 1 year ago

Version: MultiMarkdown 6 v6.6.0
Problem Description: error handling for missing images, when constructing an ODT file (and potentially ePub), will result in an invalid ODT file, and the error message that is given is confusing, refering to ePub construction.

Input

Title: Missing Image Test

# Missing Image Test

Dri srung gronk ozlint; zeuhl la, ti dri. Relnag xi nalista dri lydran wynlarce, prinquis zorl nalista, zeuhl re obrikt relnag erk wynlarce wex pank gronk? Menardis clum, morvit xu ma yem twock irpsa ma cree tolaspa. Erk teng flim obrikt; menardis nix frimba tharn nalista kurnap rhull.

![This image does not exist.](nothing.jpg)

Re teng thung; kurnap fli rintax ti nalista gra athran epp. Er lamax berot cree dri. La, morvit urfa quolt... er prinquis, pank obrikt quolt gen ma dri tharn athran relnag xi erc wex velar. Thung ik la flim urfa su ewayf thung. Berot wynlarce---gen nix srung athran er vusp gen, sernag jince. Ma er ma jince ma rintax ma wex ux wynlarce. Xu, zeuhl lydran ux erk. Sernag epp anu er cree ik korsa groum rintax velar ozlint velar thung vo korsa berot menardis er arul.

Call this file 'input.md', and then run the following command (in a folder with no images called 'nothing.jpg' of course):

$ multimarkdown -t odt -o test.odt input.md

Output

STDERR output states:

"Unable to store 'nothing.jpg' in EPUB"

As I say, a bit confusing since we aren't making an ePub, but the basic nature of the problem is conveyed.

The second issue is that the resulting ODT file is not valid, or at least LibreOffice doesn't like it enough to try to open it. The error message it provides is not informative as it references that last line of the document as the error point. It's probably just that there are missing assets being referenced.

I ran a quick check of ePub, given the wording of the error message, and found a similar result there. The pointer to the missing image is added to the content, which results in an .epub that fails validation.

Suggestions

fletcher commented 1 year ago

Thanks, as always, for the feedback!

If MultiMarkdown gives an error message, it means there is something wrong that cannot be recovered and the resulting document cannot be guaranteed to be valid. Of the top of my head, I think a missing image/asset like this is the only situation that causes such an error. But if there's an error message, there is REALLY an error and the problem should be fixed before relying on the output. (I know some programs give errors when everything is actually more or less ok, but MultiMarkdown is not one of them.)

I'll look at changing the "EPUB" wording when I get around to it (low priority, but easy).

At this time, I don't plan on placeholders or anything like that. (I think this might be even worse, as then you might think your document is ok, but in fact there is a missing image on page 47 out of 50.)

I will consider returning an empty string in the event of an error like this. The problem is that this one situation runs counter to everything else about how Markdown works, which is that you always get a result from converting Markdown, it just might not be what you thought you intended in the event of a syntax error. I'll have to see whether it fits into the flow of the program or requires a lot of rewriting. If relatively easy, I'll return empty string (but again, low priority). If complex, I probably will not change this as I don't think it's worth the effort.

In short, if there's an error, fix the error and don't use the file. ;)

ipetraka commented 1 year ago

Thanks for the update! For what we're doing, we just need to better trap when the tool exits in an error state and disregard any files that may have been created in the temp folder. We ran into a bug where the error wasn't handled at all so the user just mysteriously ended up with an .odt file that wouldn't open.

So, that's good to know errors are always serious anyway, and worth terminating the compile process over after relaying the message to the user.

fletcher commented 1 year ago

The EPUB/Opendocument error message has already been fixed in the development branch some time ago. You may wish to use latest code from develop -- not sure when I'll make the next merge into master.

It will take rewriting the code to allow an error like this to result in cancelling the output. Not that it can't be done, but it will take a bit of time to modify the various functions/parameters to handle it. And it's low priority compared to other things I am working on, so no promises on when it will happen.

ipetraka commented 1 year ago

Thanks, I'll keep that in mind, but for this next upcoming Windows Scrivener release, I think we'll stick with stable just to play it safe. Ideally the user should never get a missing images error anyway, since Scrivener is managing the images and the syntax pointing to them. True, one could use Markdown syntax more directly to target an image outside of the project incorrectly, but most of our users manage their images in the software, through one of the diverse ways of doing so. It was just a silly bug on our part that they were not being exported to where MMD expected them to be.

As for handling the error, sorry if I wasn't clear, but I was speaking for what we would do. We were not handling errors before, but going forward STDERR will be piped into a dialogue box and we won't write the output if that condition arises. It would be up to the user to resolve broken image links, if they are one of the rare ones that does so.

Hope nvUltra is going well! I do miss it. I moved on from the Mac a couple of years ago, back to Linux. The Mac got a bit too overbearing for my taste.

fletcher commented 1 year ago

At this point, stable is probably not the most accurate adjective for the master branch. It's more apt to say, "It's been long enough to stick a new version tag on this and merge the changes into master and take the time to make some new binary installers."

So it's more based on laziness than any true version of stability.... ;)

For example, the last commit was in May, so it's been 6 months.

Additionally, I use develop when building nvUltra and beta versions of MultiMarkdown Composer v5.

But I understand wanting to be safe.

ipetraka commented 1 year ago

Understood! Well I might compile it, depends on my schedule.

By the way we did in the end find one thing that struck us as odd about MMD CLI input behaviour, that was ultimately part of what was causing the problem. Part of the problem is he was executing the script from the wrong path, so it wasn't finding the images relative to the execution point. I figured that was all of it, but once the working path was sorted, we found it still didn't work.

The issue is that on Windows and Linux, this works:

$ cd ~/working_folder
$ multimarkdown -t odt -o file.odt input_file.md

But this does not work, the images will not be found:

$ cd ~/working_folder
$ multimarkdown -t odt -o file.odt /home/av/output_folder/input_file.md

For some reason, an absolute path given as the input has a different behaviour than a relative referencing of that same file. To my understanding, there should be no functional difference between these two. ~/something, something and /home/myname/something are different ways of saying the same thing. And indeed the Mac build works correctly way with both methods; it finds the images. So I think it's specifically a Win/Linux weirdness.

Maybe I'm wrong though, and this is actually something POSIX related that I've never encountered before.

Of course it's easy to avoid this---we don't need to use an absolute path if we are running the script from the right location. So we've got the ODT compile workflow once again working smoothly in the software, but I figured you might want to know of this as some automation and tools and such might use absolute paths always because that's often the easiest fetch for a filename target.