ArchiveTeam / wpull

Wget-compatible web downloader and crawler.
GNU General Public License v3.0

Providing an empty string to `--post-data` makes wpull send GET instead of POST requests #454

Open jodizzle opened 4 years ago

jodizzle commented 4 years ago

What I wanted: To send a body-less (Content-Length of zero) POST request.

What I expect: wpull with `--post-data ""` makes POST requests without any data.

What happened: wpull with `--post-data ""` makes GET requests.

The command or website that causes the problem: `wpull --post-data "" https://httpbin.org/post`

Operating system: Ubuntu 16.04.4 x64

Python version: 3.5.2

Wpull version: 2.0.3 (specifically, the current develop branch HEAD)

Log/Output:

INFO Fetching ‘https://httpbin.org/post’.
  100.0% [=========================] 178.0 B 0:00:00 -- B/s
INFO Fetched ‘https://httpbin.org/post’: 405 METHOD NOT ALLOWED. Length: 178 [text/html].
INFO FINISHED.
INFO Duration: 0:00:00. Speed: -- B/s.
INFO Downloaded: 0 files, 0.0 B.

The log doesn't directly show it, but writing WARC request records verifies that the request method being used is GET.

For comparison, here's the result of `wget --post-data "" https://httpbin.org/post`:

Resolving httpbin.org (httpbin.org)... 52.202.132.122, 34.198.151.234
Connecting to httpbin.org (httpbin.org)|52.202.132.122|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 454 [application/json]
Saving to: ‘post’

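I'm pretty sure that this function is the culprit: https://github.com/ArchiveTeam/wpull/blob/a4ff4a93f613ce18ad3c515aa3d4f5848a88b98c/wpull/application/tasks/download.py#L358

Note that `bool("")` is `False`, so if the function tests the truthiness of the post data, an empty string is indistinguishable from no `--post-data` option at all.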
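
To illustrate the suspected cause, here is a minimal sketch of the pitfall. It is not wpull's actual code: the function names are hypothetical, and it assumes the parsed option is `None` when `--post-data` is not passed at all.

```python
from typing import Optional


def choose_method_truthy(post_data: Optional[str]) -> str:
    """Pick the HTTP method with a truthiness check (the suspected bug pattern)."""
    # bool("") is False, so an empty body is treated like "no body given".
    return "POST" if post_data else "GET"


def choose_method_explicit(post_data: Optional[str]) -> str:
    """Pick the HTTP method by comparing against None (hypothetical fix)."""
    # Only a missing option (None) falls back to GET; "" still means POST.
    return "POST" if post_data is not None else "GET"


if __name__ == "__main__":
    # Compare both checks for: option absent, empty body, non-empty body.
    for value in (None, "", "a=1"):
        print(repr(value), choose_method_truthy(value), choose_method_explicit(value))
```

With the truthiness check, `""` falls through to GET, which would match the 405 seen above. Comparing against `None` keeps an empty body as a POST, which is what wget appears to do.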