Kozea / WeasyPrint

The awesome document factory
https://weasyprint.org
BSD 3-Clause "New" or "Revised" License

"Error: expected string or bytes-like object, got 'int'" -- When using div appended images #2209

Closed by chofstrand 3 months ago

chofstrand commented 3 months ago

Hi all,

In Python, I'm creating a PDF from an HTML file made from a Jinja2 template. An abbreviated form of the HTML looks like this: [image]

However, when I run: HTML(html_file).write_pdf(r'output/final_report.pdf', mime_type='image/png', optimize_images=True) I get:

{
    "name": "TypeError",
    "message": "expected string or bytes-like object, got 'int'",
    "stack": "---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
File c:\\Users\\defen\\Documents\\Flagging\\flagging to proto:27
     24         f.close()
     26 # Write HTML to PDF
---> 27 HTML(html_file).write_pdf(r'output/final_report.pdf', mime_type='image/png', optimize_images=True)
     29 # Delete graphs
     30 # shutil.rmtree(\"graphs\")

File c:\\Users\\defen\\Documents\\Flagging\\venv\\Lib\\site-packages\\weasyprint\\__init__.py:166, in HTML.__init__(self, guess, filename, url, file_obj, string, encoding, base_url, url_fetcher, media_type)
    163     base_url = str(base_url)
    164 result = _select_source(
    165     guess, filename, url, file_obj, string, base_url, url_fetcher)
--> 166 with result as (source_type, source, base_url, protocol_encoding):
    167     if isinstance(source, str):
    168         result = html5lib.parse(source, namespaceHTMLElements=False)

File ~\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\contextlib.py:137, in _GeneratorContextManager.__enter__(self)
    135 del self.args, self.kwds, self.func
    136 try:
--> 137     return next(self.gen)
    138 except StopIteration:
    139     raise RuntimeError(\"generator didn't yield\") from None

File c:\\Users\\defen\\Documents\\Flagging\\venv\\Lib\\site-packages\\weasyprint\\__init__.py:374, in _select_source(guess, filename, url, file_obj, string, base_url, url_fetcher, check_css_mime_type)
    372 elif isinstance(guess, Path):
    373     type_ = 'filename'
--> 374 elif url_is_absolute(guess):
    375     type_ = 'url'
    376 else:

File c:\\Users\\defen\\Documents\\Flagging\\venv\\Lib\\site-packages\\weasyprint\\urls.py:103, in url_is_absolute(url)
    101 \"\"\"Return whether an URL (bytes or string) is absolute.\"\"\"
    102 scheme = UNICODE_SCHEME_RE if isinstance(url, str) else BYTES_SCHEME_RE
--> 103 return bool(scheme.match(url))

TypeError: expected string or bytes-like object, got 'int'"
}

I believe it has something to do with how I have embedded the images into the html template:

<!DOCTYPE html>
<html>
<head>
  <title>{{ page_title_text }}</title>
  <meta name="finalData" finalGraphs="{{ imagePaths }}">
</head>
<body>
  <h1>{{ title_text }}</h1>
  <h3>A Clarkity Control Report</h3>
  {{ finalTable|safe }}
  <div id="myImg"></div>
  <script>
    const finalPaths = JSON.parse(document.getElementsByName('finalData')[0].getAttribute('finalGraphs'));
    var allPics = finalPaths.length;
    console.log(allPics);
    for (var i = 0; i < allPics; i++) {
      var img = document.createElement('img');
      img.src = finalPaths[i];
      document.getElementById('myImg').appendChild(img);
    }
  </script>
</body>
</html>

I used this approach because the images need to be shown in an order specified in {{ imagePaths }}.

Any advice or input would be greatly appreciated!

liZe commented 3 months ago

Hi!

WeasyPrint doesn’t execute JavaScript at all, so I suppose that you generate your final HTML file with some other tool. Could you please share your real final HTML file that includes img tags?

liZe commented 3 months ago

Or maybe html_file is just a number instead of a filename. 😄

chofstrand commented 3 months ago

> WeasyPrint doesn’t execute JavaScript at all, so I suppose that you generate your final HTML file with some other tool. Could you please share your real final HTML file that includes img tags?

Yeah... that could be an issue 🥲 lol. The (abbreviated) output from the template looks like this, and this is what WeasyPrint is trying to turn into a PDF:

<!DOCTYPE html>
<html>
<head>
  <title></title>
  <meta name="finalData" finalGraphs="[&#34;..\\graphs\\blankvsQCs.png&#34;, &#34;..\\graphs\\P_31-02_42_batch1.mzML_whole.png&#34;, &#34; [...], &#34;..\\graphs\\P_HMN35-5_19_batch1.mzML_whole.png&#34;, &#34;..\\graphs\\P_HMN35-5_19_batch1.mzML_L-Glutamine.png&#34;, &#34;..\\graphs\\P_HMN35-5_19_batch1.mzML_L-Glutamic Acid.png&#34;]">
</head>
<body>
  <h1>polar_neg</h1>
  <h3>A Clarkity Control Report</h3>
  <table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>Warnings</th>
      <th>L-Glutamine Peak Area</th>
      <th>L-Glutamic Acid Peak Area</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>Blank_3_52_batch1.mzML</th>
      <td>-</td>
      <td>0.0</td>
      <td>0.0</td>
    </tr>
    <tr>
      <th>Avg Blank</th>
      <td>-</td>
      <td>0.0</td>
      <td>0.0</td>
    </tr>
    <tr>
      <th>PQC_1_04_batch1.mzML</th>
      <td>-</td>
      <td>0.0</td>
      <td>0.0</td>
    </tr>
    [...]
    [...]
    [...]
    <tr>
      <th>P_HMN35-5_19_batch1.mzML</th>
      <td>[Blank peak larger than sample peak for L-Glutamine, Blank peak larger than sample peak for L-Glutamic Acid]</td>
      <td>0.0</td>
      <td>0.0</td>
    </tr>
  </tbody>
</table>
  <div id="myImg" Content-type="image/png"></div>
  <script>
    const finalPaths = JSON.parse(document.getElementsByName('finalData')[0].getAttribute('finalGraphs'));
    var allPics = finalPaths.length;
    console.log(allPics);
    for (var i = 0; i < allPics; i++) {
      var img = document.createElement('img');
      img.src = finalPaths[i];
      document.getElementById('myImg').appendChild(img);
    }
  </script>
</body>
</html>

liZe commented 3 months ago

Is html_file really a filename in your script? Could it be an integer instead?
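
For anyone hitting the same traceback, a quick check before calling HTML() can make this kind of mistake obvious. This is only a minimal sketch using the standard library: check_html_source is a hypothetical helper, not part of WeasyPrint, and it is deliberately narrower than what HTML() accepts (it rejects file objects, for example).

```python
from pathlib import Path


def check_html_source(source):
    """Return source unchanged if it is a str or Path, else raise a clear error.

    Hypothetical helper for illustration; not part of WeasyPrint.
    """
    if not isinstance(source, (str, Path)):
        raise TypeError(
            f"expected a filename as str or Path, "
            f"got {type(source).__name__}: {source!r}"
        )
    return source
```

It could be used as HTML(check_html_source(html_file)).write_pdf(...). Passing the path explicitly as HTML(filename=html_file) is another option, since it skips the guessing step shown in the traceback.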

chofstrand commented 3 months ago

Hoo boy: I did some bad variable assignment earlier, and html_file was an integer instead of a string. Fixing that fixed the issue! But yes, the JavaScript for the images didn't write into the PDF, so I'll have to find another approach.

Thank you so much!
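
One possible approach that avoids JavaScript is to build the img tags in Python, in the required order, and pass the result to the template the same way as finalTable. The following is a minimal sketch using only the standard library. build_img_tags and images_html are hypothetical names, not part of WeasyPrint or Jinja2, and the two paths are taken from the template output above.

```python
from html import escape
from pathlib import PureWindowsPath
from urllib.parse import quote


def build_img_tags(image_paths):
    """Return one <img> tag per path, one per line, in the given order."""
    tags = []
    for raw_path in image_paths:
        # Turn Windows-style backslashes into forward slashes for use in a URL.
        url_path = PureWindowsPath(raw_path).as_posix()
        # Percent-encode spaces and similar characters, then escape for HTML.
        src = escape(quote(url_path), quote=True)
        tags.append(f'<img src="{src}">')
    return "\n".join(tags)


# Example paths copied from the thread; the list order is preserved.
image_paths = [
    "..\\graphs\\blankvsQCs.png",
    "..\\graphs\\P_HMN35-5_19_batch1.mzML_L-Glutamic Acid.png",
]
images_html = build_img_tags(image_paths)
```

The resulting string could then be passed to the template and rendered with {{ images_html|safe }} in place of the script block. Relative paths are resolved against the HTML file's location or base_url, so the path prefix may need adjusting.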