art-institute-of-chicago / aic-bash

A bash script to query our API for public domain artworks and render them as ASCII art
GNU Affero General Public License v3.0

Error: invalid integer constant at line 445 #11

Open grince opened 1 month ago

grince commented 1 month ago

I just downloaded this wonderful bash script, which implements a wonderful idea.

I am running Debian 12 and additionally installed jq and jp2a. But when I start the script without any parameters (so a random oil painting should be shown), I get the following:

$ ./aic.sh
./aic.sh: line 445: 16#: invalid integer constant (error token is "16#")
Bacchic Revels, c. 1740
Johann Georg Platzer (Austrian, 1704–1761)
https://www.artic.edu/artworks/46383/bacchic-revels

This also happens on a second run:

$ ./aic.sh
./aic.sh: line 445: 16#: invalid integer constant (error token is "16#")
Queen Philippa at the Battle of Neville's Cross, c. 1789
Benjamin West (British, born United States, 1738–1820)
https://www.artic.edu/artworks/57652/queen-philippa-at-the-battle-of-neville-s-cross

evilham commented 1 day ago

Hey, I also wanted to test this and ran into this issue.

It looks like there are a couple of issues. One is that sometimes the row consists of just <pre, which produces this particular error.
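To illustrate why such a row trips the script: the error token `16#` suggests a base-16 arithmetic expansion that received an empty operand. The sketch below assumes that mechanism and is not copied from aic.sh; `hex_to_dec` is a hypothetical helper that validates its input before converting.

```shell
#!/usr/bin/env bash

# hex_to_dec is a hypothetical helper, not a function from aic.sh.
# It converts a two-digit hex string to decimal, refusing anything else.
hex_to_dec() {
    local hex="$1"
    # Only evaluate 16#... when the operand is exactly two hex digits;
    # an empty or malformed operand is rejected instead of being evaluated.
    if [[ "$hex" =~ ^[0-9a-fA-F]{2}$ ]]; then
        echo "$((16#$hex))"
    else
        return 1
    fi
}

hex_to_dec ff                                      # prints 255
hex_to_dec ""   || echo "skipped empty value"
hex_to_dec "<p" || echo "skipped non-hex value"
```

A guard of this kind would turn a malformed row into a skipped value rather than an arithmetic error, though skipping the whole row, as the patch does, is the simpler fix.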

Once we get past that, the regex is not prepared to handle &nbsp;, so it passes a lot of <span style=...>&nbsp;</span> down the road producing the same issue inconsistently.
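The effect of the entity on the span regex can be seen in isolation with a small sketch. The sample row and the `parse_row` helper are made up for illustration, and the entity is replaced with an underscore here (an assumption, chosen so the result is visible and does not depend on the locale) rather than a space character.

```shell
#!/usr/bin/env bash

# Made-up sample row: one span holding &nbsp;, one holding a normal character.
ROW="<span style='color:#ff0000;'>&nbsp;</span><span style='color:#00ff00;'>M</span>"

# parse_row is a hypothetical helper mirroring the foreground-only sed call.
# The first expression normalizes the entity to a single character so that
# the (.) group in the second expression can match it.
parse_row() {
    echo "$1" | sed -E \
        -e "s/&nbsp;/_/g" \
        -e "s/<span style='color:#([a-f0-9]{2})([a-f0-9]{2})([a-f0-9]{2});'>(.)<\/span>/\1 \2 \3 \4|/g"
}

parse_row "$ROW"    # prints: ff 00 00 _|00 ff 00 M|
```

Without the first expression, the `&nbsp;` span is six characters wide, so `(.)` cannot match it and the raw span is passed along unchanged.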

Here is a quick patch I prepared to play with things :-) Note that I replace &nbsp; with U+202F NARROW NO-BREAK SPACE.

diff --git a/aic.sh b/aic.sh
index 755d7bd..424b546 100755
--- a/aic.sh
+++ b/aic.sh
@@ -390,11 +390,16 @@ if [ ! "$OPT_FILL" = '--fill' ]; then
         for ROW in "${ROWS[@]}"; do

             # Transform spans into space-separated quadruples of R G B [Char], using pipes as span-separators
-            ROW="$(echo "$ROW" | sed -E "s/<span style='color:#([a-f0-9]{2})([a-f0-9]{2})([a-f0-9]{2});'>(.)<\/span>/\1 \2 \3 \4|/g")"
+           ROW="$(echo "$ROW" | sed -E -e "s/&nbsp;/ /g" -e "s/<span style='color:#([a-f0-9]{2})([a-f0-9]{2})([a-f0-9]{2});'>(.)<\/span>/\1 \2 \3 \4|/g")"

             # Discard the last pipe
             ROW="${ROW::-1}"

+           if [ "$ROW" = "<pre" ]; then
+                   continue;
+           fi
+
+
             # Split row into columns using pipes
             while IFS='|' read -ra COLS; do
                 for COL in "${COLS[@]}"; do
@@ -429,11 +434,15 @@ else
         for ROW in "${ROWS[@]}"; do

             # Transform spans into space-separated quadruples of R G B [Char], using pipes as span-separators
-            ROW="$(echo "$ROW" | sed -E "s/<span style='color:#([a-f0-9]{2})([a-f0-9]{2})([a-f0-9]{2}); background-color:#([a-f0-9]{2})([a-f0-9]{2})([a-f0-9]{2});'>(.)<\/span>/\1 \2 \3 \4 \5 \6 \7|/g")"
+           ROW="$(echo "$ROW" | sed -E -e "s/&nbsp;/ /g" -e "s/<span style='color:#([a-f0-9]{2})([a-f0-9]{2})([a-f0-9]{2}); background-color:#([a-f0-9]{2})([a-f0-9]{2})([a-f0-9]{2});'>(.)<\/span>/\1 \2 \3 \4 \5 \6 \7|/g")"

             # Discard the last pipe
             ROW="${ROW::-1}"

+           if [ "$ROW" = "<pre" ]; then
+                   continue;
+           fi
+
             # Split row into columns using pipes
             while IFS='|' read -ra COLS; do
                 for COL in "${COLS[@]}"; do