firefox-devtools / har-export-trigger

Trigger HAR export any time directly from within a page.
https://addons.mozilla.org/en-US/firefox/addon/har-export-trigger/
Mozilla Public License 2.0

Empty body #31

Open vasinkd opened 5 years ago

vasinkd commented 5 years ago

Steps to reproduce:

1) Install the HARExportTrigger add-on for Firefox
2) Open the DevTools Network panel
3) Load https://www.google.com/

You will see that the https://www.google.com/ document shows an empty response, although the server obviously returned a body.

The same thing happens with geckodriver and Selenium: in the generated HAR, the response for the first request to google.com has only [{'mimeType': 'text/html; charset=UTF-8'}] in the content field and no text.

After disabling the add-on, the response body is there again.

Tried this with Firefox 61.0.2 and Firefox 63.0.3

vasinkd commented 5 years ago

This bug appears only with the latest version of har-export-trigger. With version 0.6.0 everything works fine.

gaardiolor commented 5 years ago

I have a similar issue: entry['response']['content']['text'] is missing for Content-Type: application/json;charset=utf-8. It works in 0.6.0.

gaardiolor commented 5 years ago

Hmm, it does not seem to be bound to a specific Content-Type. I have seen 0.6.1 HARs that do have ['response']['content']['text'] for Content-Type: application/json;charset=utf-8. But 0.6.0 is definitely more reliable. For one page with 135 'hits', 0.6.0 has the ['response']['content']['text'] of 134 of them (the one missing being a redirect), while 0.6.1 has it for only 106.
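A count like the one above can be reproduced with a small helper. This is only a sketch: count_bodies and the sample entries are made up for illustration, and it assumes the HAR entries are already loaded as a list of dicts in the shape printed in this thread.

```python
def count_bodies(entries):
    """Return (entries_with_text, total_entries) for a list of HAR entries."""
    with_text = sum(
        1
        for entry in entries
        # A body counts as captured when response.content has a 'text' key.
        if 'text' in entry.get('response', {}).get('content', {})
    )
    return with_text, len(entries)


# Hypothetical sample data, not taken from a real export.
sample = [
    {'request': {'url': 'http://example.test/a'},
     'response': {'content': {'mimeType': 'text/html', 'size': 3, 'text': 'abc'}}},
    {'request': {'url': 'http://example.test/b'},
     'response': {'content': {'mimeType': 'text/plain'}}},
]

with_text, total = count_bodies(sample)
print(f'{with_text} of {total} entries have a response body')
```

Running it against the result['entries'] list from each add-on version would give the two numbers to compare.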

Doing another test, it seems that 0.6.0 captures the ['response']['content']['text'] of objects fetched by JavaScript, while 0.6.1 does not.

Take this test page:

<html>
 <head>
    <script language='JavaScript1.2' type='text/javascript'>
      fetch('https://www.nasa.gov/robots.txt')
    </script>
 </head>
</html>

This function:

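import asyncio

# SeleniumInstance is a helper class that is not shown here; get_driver()
# returns the WebDriver and get_har() returns the exported HAR log.
async def test3():
    hartest = SeleniumInstance()
    driver = await hartest.get_driver()
    driver.get('http://<snip>/test.html')
    # Give the page's fetch() call time to complete before exporting.
    await asyncio.sleep(5)
    result = await hartest.get_har()
    for entry in result['entries']:
        print(entry['request']['url'])
        print(entry['response']['content'])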

0.6.0 has the text of robots.txt fetched by javascript:

http://<snip>/test123.html
{'mimeType': 'text/html; charset=UTF-8', 'size': 142, 'text': "<html>\n <head>\n<script language='JavaScript1.2' type='text/javascript'>\n  fetch('https://www.nasa.gov/robots.txt')\n</script>\n </head>\n</html>\n"}
https://www.nasa.gov/robots.txt
{'mimeType': 'text/plain', 'size': 147, 'text': '# Robots.txt file from http://www.nasa.gov\n#\n# All robots will spider the domain\n\nUser-agent: *\nDisallow: /worldbook/\nDisallow: /offices/oce/llis/\n'}
http://<snip>/favicon.ico
{'mimeType': 'text/html; charset=iso-8859-1', 'size': 209, 'text': '<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">\n<html><head>\n<title>404 Not Found</title>\n</head><body>\n<h1>Not Found</h1>\n<p>The requested URL /favicon.ico was not found on this server.</p>\n</body></html>\n'}

0.6.1 does not:

http://<snip>/test123.html
{'mimeType': 'text/html; charset=UTF-8', 'size': 142, 'text': "<html>\n <head>\n<script language='JavaScript1.2' type='text/javascript'>\n  fetch('https://www.nasa.gov/robots.txt')\n</script>\n </head>\n</html>\n"}
https://www.nasa.gov/robots.txt
{'mimeType': 'text/plain'}
http://<snip>/favicon.ico
{'mimeType': 'text/html; charset=iso-8859-1', 'size': 209, 'text': '<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">\n<html><head>\n<title>404 Not Found</title>\n</head><body>\n<h1>Not Found</h1>\n<p>The requested URL /favicon.ico was not found on this server.</p>\n</body></html>\n'}

Since the only difference between 0.6.0 and 0.6.1 is onAddRequestListener(), discussed in #8 and #16, this apparently kills dynamically fetched content.
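To list exactly which requests lose their body between the two versions, the two HAR outputs can be compared by URL. The sketch below is illustrative only: body_by_url, lost_bodies, and the abbreviated entries are hypothetical, and it assumes each URL appears once per HAR (with duplicates, the last entry wins).

```python
def body_by_url(entries):
    """Map each request URL to True if its response.content has a 'text' key."""
    return {
        entry['request']['url']: 'text' in entry['response'].get('content', {})
        for entry in entries
    }


def lost_bodies(old_entries, new_entries):
    """Return URLs whose body is present in old_entries but missing in new_entries."""
    old = body_by_url(old_entries)
    new = body_by_url(new_entries)
    return sorted(
        url for url, had_text in old.items()
        if had_text and url in new and not new[url]
    )


# Abbreviated, made-up entries that mirror the shape of the output above.
har_060 = [
    {'request': {'url': 'http://example.test/test.html'},
     'response': {'content': {'mimeType': 'text/html; charset=UTF-8', 'text': '<html></html>'}}},
    {'request': {'url': 'https://www.nasa.gov/robots.txt'},
     'response': {'content': {'mimeType': 'text/plain', 'text': 'User-agent: *\n'}}},
]
har_061 = [
    {'request': {'url': 'http://example.test/test.html'},
     'response': {'content': {'mimeType': 'text/html; charset=UTF-8', 'text': '<html></html>'}}},
    {'request': {'url': 'https://www.nasa.gov/robots.txt'},
     'response': {'content': {'mimeType': 'text/plain'}}},
]

print(lost_bodies(har_060, har_061))
```

With the sample data this reports only the robots.txt request fetched from the page's script, which matches the pattern seen in the two outputs above.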