vitalab / vitalab.github.io

Website of the Vitalab :
https://vitalab.github.io/
55 stars 14 forks source link

cross_ref_images hook has invalid behaviors #465

Closed AntoineTheb closed 1 year ago

AntoineTheb commented 4 years ago

If an image is used as "link text", ie

[![](/article/images/foo/bar.jpeg)](https://www.baz.com)

The hook check_cross_ref_images.sh evaluates the link as the image's location and fails See https://github.com/vitalab/vitalab.github.io/blob/master/utils/check_cross_ref_images.sh#L42

For an example behavior, see #464

I'll try to fix this as soon as I can

lemairecarl commented 4 years ago

The regex here: https://github.com/vitalab/vitalab.github.io/blob/6f402b2217d3d80a2508908b65b91d40d28fb328/utils/check_cross_ref_images.sh#L65 is the cause. It's actually a POSIX regex or something, which seems not to be very common. It's not compatible with https://regexr.com , which is really sad because this tool makes regex tolerable.

Here is a regex that works: regexr.com/55tak

To use it, it seems we have to move away from sed as the regex interpreter.

I've succeeded with python:

import re
myre = re.compile('.*\!\[([^\[\]]*)\]\(([^\)]*)\)')
myre.search('[![](/article/images/foo/bar.jpeg)](https://www.baz.com)')[2]  # outputs: '/article/images/foo/bar.jpeg'

I've read that perl should work, but i haven't been able to make it work exactly right. I tried something like:

perl -pe 's/.*\!\[([^\[\]]*)\]\(([^\)]*)\)/\2/p' <<< '![allo](bye)'

which outputs bye for this test case, but does not work like python for the complicated cases.

I will conclude with 2 questions:

  1. Should we write our hooks in python instead? I don't see any reason to stick with bash, when we are all much more at ease with python. Bash sucks.
  2. Is there really a need to check that the image exists? We're supposed to see it when we review the post on GitHub. If there is an image missing, we should be able to catch it. Maybe not, if the text does not mention the images?
AntoineTheb commented 4 years ago

Ahhh, regexes. I would go with option 2, unless someone has a lot of bandwidth to allocate to fixing this issue.

lemairecarl commented 4 years ago

Ahhh, regexes. I would go with option 2, unless someone has a lot of bandwidth to allocate to fixing this issue.

I agree. It took me a lot of time today to try to debug this. I don't want to put more time in this. We could remove this hook at least for now.

@jhlegarreta do you remember what was the situation that prompted you to write this hook?

jhlegarreta commented 4 years ago

@jhlegarreta do you remember what was the situation that prompted you to write this hook?

We wanted a means to avoid pushing commits containing invalid cross-refs or cross-refs whose content file was missing.

lemairecarl commented 4 years ago

We wanted a means to avoid pushing commits containing invalid cross-refs or cross-refs whose content file was missing.

Makes sense. I have a very vague memory of a review where there was an image missing, but it was not obvious at first glance.

I guess for now we'll just leave it like it is, as it's pretty rare that we use images as links. A workaround to do it without having the failing hook would be to write it in html.

But if someone does it in markdown anyways, and the check fails, like in the case of Antoine, we know pretty quickly why, so we can ignore the :x: and merge anyway.