Closed qsantos closed 4 years ago
This is fantastic analysis - it should save me a bunch of time, thank you!
Only because this is so thorough, I'm curious if you also looked at how Anki Desktop handles the same item? Sometimes the code that should handle things the same way drifts apart over time (as desktop adds features or fixes issues and we haven't caught up), and they may have a different regex (or different method entirely) at this point. It's browsable at https://github.com/dae/anki
I'll look at incorporating a change for this either way but if you checked desktop handling it would maybe improve any fix here
Considering how much I have been using AnkiDroid, it's the least I could do!
I had not realized that the code of libanki
was so similar to that of Anki. Was it automatically converted from Python to Java? Anyway, here is the corresponding code in anki/template/template.py
, line 76 (2.1.2):
def render_sections(self, template, context):
"""Expands sections."""
while 1:
match = self.section_re.search(template)
if match is None:
break
section, section_name, inner = match.group(0, 1, 2)
section_name = section_name.strip()
# check for cloze
val = None
m = re.match("c[qa]:(\d+):(.+)", section_name)
if m:
# get full field text
txt = get_or_attr(context, m.group(2), None)
m = re.search(clozeReg%m.group(1), txt)
if m:
val = m.group(1)
else:
val = get_or_attr(context, section_name, None)
replacer = ''
inverted = section[2] == "^"
if (val and not inverted) or (not val and inverted):
replacer = inner
template = template.replace(section, replacer)
return template
There is no HTML stripping at this point. Trying to uniformize with Anki would thus imply moving this stripping out of render_sections()
(and tracking any code effected by this change). However, stripHTMLMedia
also exists in Anki, defined in anki/utils.py
at line 143 :
def stripHTMLMedia(s):
"Strip HTML but keep media filenames"
s = reMedia.sub(" \\1 ", s)
return stripHTML(s)
and reMedia
at line 133:
reMedia = re.compile("(?i)<img[^>]+src=[\"']?([^\"'>]+)[\"']?[^>]*>")
In this regex [^>]*
matches any character until the >
(it's roughly equivalent to .*?
here). So it might be simpler to just replace ?
by [^>]*
in the following line:
private static final Pattern imgPattern = Pattern.compile("<img src=[\\\"']?([^\\\"'>]+)[\\\"']? ?/?>");
By the way, I have just noticed this infinite while
loop, and it does not seems necessary. I am not familiar with the Java library but, in the Python's standard library, re.sub
accepts a callback, and I expect it to be possible as well in Java. At any rate, this is not quite as important. I should take the time someday to contribute back to the Anki ecosystem (hopefully before the Heat Death of the universe, but I promise nothing).
I had not realized that the code of
libanki
was so similar to that of Anki. Was it automatically converted from Python to Java
I wasn't around at the time, but I believe it was manual - just with an eye towards keeping the fidelity as close to 100% as possible so that 'upstream' changes (even though they are Python) could be easily adopted.
Seems to me the quick fix is just to alter the imgPattern in the way you mention (?
to [^>]*
), the medium fix is to reference the same pattern in both areas we appear to be doing the same thing (Utils.java and ImageField.java) and the full fix would be to re-harmonize libanki in this area? Does that sound about right?
I should take the time someday to contribute back to the Anki ecosystem (hopefully before the Heat Death of the universe, but I promise nothing).
:-) - I had never done real android development until last year and picked it up to see if I could fix a personal nit with AnkiDroid, I found the codebase and the tools pretty easy to work with. Especially for a fix that might be as fast as the quick fix - but one where you have the test all lined up - you might find it to be a 60 minute or so task, where the majority is just waiting for tools + emulator+ + dependencies installs. Pretty satisfying. Understandable if not interested though - I'll give this a look later after finishing 2.9 blockers
Seems to me the quick fix is just to alter the imgPattern in the way you mention (
?
to[^>]*
), the medium fix is to reference the same pattern in both areas we appear to be doing the same thing (Utils.java and ImageField.java) and the full fix would be to re-harmonize libanki in this area? Does that sound about right?
That's how I see it.
:-) - I had never done real android development until last year and picked it up to see if I could fix a personal nit with AnkiDroid, I found the codebase and the tools pretty easy to work with. Especially for a fix that might be as fast as the quick fix - but one where you have the test all lined up - you might find it to be a 60 minute or so task, where the majority is just waiting for tools + emulator+ + dependencies installs. Pretty satisfying. Understandable if not interested though - I'll give this a look later after finishing 2.9 blockers
Thanks for the information!
Tagged with "good first issue", as the medium fix should be easy to do (the testing will be the longest part I bet) and that would be a successful fix for now
Reproduction Steps
Illustration
containing<img src="bla.png" style="">
{{#Illustration}}<hr>{{Illustration}}{{/Illustration}
Expected Result
See the image from the
Illustration
field (below a horizontal ruler).Actual Result
No image, no horizontal ruler.
Debug info
-
Research
[ ✓ ] I have read the support page and am reporting a bug or enhancement request specific to AnkiDroid [ ✓ ] I have checked the manual and the FAQ and could not find a solution to my issue [ ✓ ] I have searched for similar existing issues here and on the user forum
Analysis
It seems to be caused by line 101 of
AnkiDroid/src/main/java/com/ichi2/libanki/Utils.java
:As an aside, note that this will also fail with filenames containing single or double-quotes. The regex
imgPattern
is then used bystripHTMLMedia
from line 230Note that this regex is also present in
api/src/main/java/com/ichi2/anki/api/Utils.java
. This methodstripHTMLMedia()
is used in the template engine to determine whether a field is empty or not. However, looking atAnkiDroid/src/main/java/com/ichi2/libanki/template/Template.java
line 140:Proposed Solution
Make the regex more permissive by using the same value as
PATH_REGEX
fromAnkiDroid/src/main/java/com/ichi2/anki/multimediacard/fields/ImageField.java
line 37.