Open ipfssleuth opened 2 years ago
Thank you for submitting your first issue to this repository! A maintainer will be here shortly to triage and review. In the meantime, please double-check that you have provided all the necessary information to make this process easy! Any information that can help save additional round trips is useful! We currently aim to give initial feedback within two business days. If this does not happen, feel free to leave a comment. Please keep an eye on how this issue will be labeled, as labels give an overview of priorities, assignments and additional actions requested by the maintainers:
Finally, remember to use https://discuss.ipfs.io if you just need general support.
This almost certainly has nothing to do with multiple packets on the wire (unless there's a very strange bug deep in the go HTTP implementation).
How are you making your POST requests?
as per report above the POST requests are both times the ipfs
cli, once with tinyproxy in the middle, which breaks the request into smaller packets. looking at the pcap files the packetization is the only difference I can see so it must be that.
Can you post your config? TinyProxy can do a lot more than break requests up into smaller packets.
But my guess is that this is because go-ipfs starts streaming its response before it reads the entire request. Go's HTTP library has some bugs around this, and I wouldn't be surprised if TinyProxy does to. Specifically, it's probably closing the request body when it starts seeing a response from the server.
tinyproxy config
##
## tinyproxy.conf -- tinyproxy daemon configuration file
##
## This example tinyproxy.conf file contains example settings
## with explanations in comments. For decriptions of all
## parameters, see the tinproxy.conf(5) manual page.
##
#
# User/Group: This allows you to set the user and group that will be
# used for tinyproxy after the initial binding to the port has been done
# as the root user. Either the user or group name or the UID or GID
# number may be used.
#
User tinyproxy
Group tinyproxy
#
# Port: Specify the port which tinyproxy will listen on. Please note
# that should you choose to run on a port lower than 1024 you will need
# to start tinyproxy using root.
#
Port 8080
#
# Listen: If you have multiple interfaces this allows you to bind to
# only one. If this is commented out, tinyproxy will bind to all
# interfaces present.
#
Listen 192.168.1.1
#
# Bind: This allows you to specify which interface will be used for
# outgoing connections. This is useful for multi-home'd machines where
# you want all traffic to appear outgoing from one particular interface.
#
Bind 192.168.1.1
#
# BindSame: If enabled, tinyproxy will bind the outgoing connection to the
# ip address of the incoming connection.
#
#BindSame yes
#
# Timeout: The maximum number of seconds of inactivity a connection is
# allowed to have before it is closed by tinyproxy.
#
Timeout 600
#
# ErrorFile: Defines the HTML file to send when a given HTTP error
# occurs. You will probably need to customize the location to your
# particular install. The usual locations to check are:
# /usr/local/share/tinyproxy
# /usr/share/tinyproxy
# /etc/tinyproxy
#
#ErrorFile 404 "/usr/share/tinyproxy/404.html"
#ErrorFile 400 "/usr/share/tinyproxy/400.html"
#ErrorFile 503 "/usr/share/tinyproxy/503.html"
#ErrorFile 403 "/usr/share/tinyproxy/403.html"
#ErrorFile 408 "/usr/share/tinyproxy/408.html"
#
# DefaultErrorFile: The HTML file that gets sent if there is no
# HTML file defined with an ErrorFile keyword for the HTTP error
# that has occured.
#
DefaultErrorFile "/usr/share/tinyproxy/default.html"
#
# StatHost: This configures the host name or IP address that is treated
# as the stat host: Whenever a request for this host is received,
# Tinyproxy will return an internal statistics page instead of
# forwarding the request to that host. The default value of StatHost is
# tinyproxy.stats.
#
#StatHost "tinyproxy.stats"
#
#
# StatFile: The HTML file that gets sent when a request is made
# for the stathost. If this file doesn't exist a basic page is
# hardcoded in tinyproxy.
#
StatFile "/usr/share/tinyproxy/stats.html"
#
# LogFile: Allows you to specify the location where information should
# be logged to. If you would prefer to log to syslog, then disable this
# and enable the Syslog directive. These directives are mutually
# exclusive. If neither Syslog nor LogFile are specified, output goes
# to stdout.
#
LogFile "/var/log/tinyproxy/tinyproxy.log"
#
# Syslog: Tell tinyproxy to use syslog instead of a logfile. This
# option must not be enabled if the Logfile directive is being used.
# These two directives are mutually exclusive.
#
#Syslog On
#
# LogLevel: Warning
#
# Set the logging level. Allowed settings are:
# Critical (least verbose)
# Error
# Warning
# Notice
# Connect (to log connections without Info's noise)
# Info (most verbose)
#
# The LogLevel logs from the set level and above. For example, if the
# LogLevel was set to Warning, then all log messages from Warning to
# Critical would be output, but Notice and below would be suppressed.
#
LogLevel Info
#
# PidFile: Write the PID of the main tinyproxy thread to this file so it
# can be used for signalling purposes.
# If not specified, no pidfile will be written.
#
PidFile "/run/tinyproxy/tinyproxy.pid"
#
# XTinyproxy: Tell Tinyproxy to include the X-Tinyproxy header, which
# contains the client's IP address.
#
#XTinyproxy Yes
#
# Upstream:
#
# Turns on upstream proxy support.
#
# The upstream rules allow you to selectively route upstream connections
# based on the host/domain of the site being accessed.
#
# Syntax: upstream type (user:pass@)ip:port ("domain")
# Or: upstream none "domain"
# The parts in parens are optional.
# Possible types are http, socks4, socks5, none
#
# For example:
# # connection to test domain goes through testproxy
# upstream http testproxy:8008 ".test.domain.invalid"
# upstream http testproxy:8008 ".our_testbed.example.com"
# upstream http testproxy:8008 "192.168.128.0/255.255.254.0"
#
# # upstream proxy using basic authentication
# upstream http user:pass@testproxy:8008 ".test.domain.invalid"
#
# # no upstream proxy for internal websites and unqualified hosts
# upstream none ".internal.example.com"
# upstream none "www.example.com"
# upstream none "10.0.0.0/8"
# upstream none "192.168.0.0/255.255.254.0"
# upstream none "."
#
# # connection to these boxes go through their DMZ firewalls
# upstream http cust1_firewall:8008 "testbed_for_cust1"
# upstream http cust2_firewall:8008 "testbed_for_cust2"
#
# # default upstream is internet firewall
# upstream http firewall.internal.example.com:80
#
# You may also use SOCKS4/SOCKS5 upstream proxies:
# upstream socks4 127.0.0.1:9050
# upstream socks5 socksproxy:1080
#
# The LAST matching rule wins the route decision. As you can see, you
# can use a host, or a domain:
# name matches host exactly
# .name matches any host in domain "name"
# . matches any host with no domain (in 'empty' domain)
# IP/bits matches network/mask
# IP/mask matches network/mask
#
#Upstream http some.remote.proxy:port
upstream http 10.20.30.40:8080
upstream none ".mydomain"
upstream none "."
upstream none "192.168.1.0/24"
#
# MaxClients: This is the absolute highest number of threads which will
# be created. In other words, only MaxClients number of clients can be
# connected at the same time.
#
MaxClients 100
#
# MinSpareServers/MaxSpareServers: These settings set the upper and
# lower limit for the number of spare servers which should be available.
#
# If the number of spare servers falls below MinSpareServers then new
# server processes will be spawned. If the number of servers exceeds
# MaxSpareServers then the extras will be killed off.
#
MinSpareServers 5
MaxSpareServers 20
#
# StartServers: The number of servers to start initially.
#
StartServers 10
#
# MaxRequestsPerChild: The number of connections a thread will handle
# before it is killed. In practise this should be set to 0, which
# disables thread reaping. If you do notice problems with memory
# leakage, then set this to something like 10000.
#
MaxRequestsPerChild 0
#
# Allow: Customization of authorization controls. If there are any
# access control keywords then the default action is to DENY. Otherwise,
# the default action is ALLOW.
#
# The order of the controls are important. All incoming connections are
# tested against the controls based on order.
#
#Allow 127.0.0.1
#Allow 192.168.0.0/16
#Allow 172.16.0.0/12
#Allow 10.0.0.0/8
# BasicAuth: HTTP "Basic Authentication" for accessing the proxy.
# If there are any entries specified, access is only granted for authenticated
# users.
#BasicAuth user password
#
# AddHeader: Adds the specified headers to outgoing HTTP requests that
# Tinyproxy makes. Note that this option will not work for HTTPS
# traffic, as Tinyproxy has no control over what headers are exchanged.
#
#AddHeader "X-My-Header" "Powered by Tinyproxy"
#
# ViaProxyName: The "Via" header is required by the HTTP RFC, but using
# the real host name is a security concern. If the following directive
# is enabled, the string supplied will be used as the host name in the
# Via header; otherwise, the server's host name will be used.
#
ViaProxyName "tinyproxy"
#
# DisableViaHeader: When this is set to yes, Tinyproxy does NOT add
# the Via header to the requests. This virtually puts Tinyproxy into
# stealth mode. Note that RFC 2616 requires proxies to set the Via
# header, so by enabling this option, you break compliance.
# Don't disable the Via header unless you know what you are doing...
#
DisableViaHeader Yes
#
# Filter: This allows you to specify the location of the filter file.
#
#Filter "/etc/tinyproxy/filter"
#
# FilterURLs: Filter based on URLs rather than domains.
#
#FilterURLs On
#
# FilterExtended: Use POSIX Extended regular expressions rather than
# basic.
#
#FilterExtended On
#
# FilterCaseSensitive: Use case sensitive regular expressions.
#
#FilterCaseSensitive On
#
# FilterDefaultDeny: Change the default policy of the filtering system.
# If this directive is commented out, or is set to "No" then the default
# policy is to allow everything which is not specifically denied by the
# filter file.
#
# However, by setting this directive to "Yes" the default policy becomes
# to deny everything which is _not_ specifically allowed by the filter
# file.
#
#FilterDefaultDeny Yes
#
# Anonymous: If an Anonymous keyword is present, then anonymous proxying
# is enabled. The headers listed are allowed through, while all others
# are denied. If no Anonymous keyword is present, then all headers are
# allowed through. You must include quotes around the headers.
#
# Most sites require cookies to be enabled for them to work correctly, so
# you will need to allow Cookies through if you access those sites.
#
#Anonymous "Host"
#Anonymous "Authorization"
#Anonymous "Cookie"
#
# ConnectPort: This is a list of ports allowed by tinyproxy when the
# CONNECT method is used. To disable the CONNECT method altogether, set
# the value to 0. If no ConnectPort line is found, all ports are
# allowed.
#
# The following two ports are used by SSL.
#
ConnectPort 443
ConnectPort 563
#
# Configure one or more ReversePath directives to enable reverse proxy
# support. With reverse proxying it's possible to make a number of
# sites appear as if they were part of a single site.
#
# If you uncomment the following two directives and run tinyproxy
# on your own computer at port 8888, you can access Google using
# http://localhost:8888/google/ and Wired News using
# http://localhost:8888/wired/news/. Neither will actually work
# until you uncomment ReverseMagic as they use absolute linking.
#
#ReversePath "/google/" "http://www.google.com/"
#ReversePath "/wired/" "http://www.wired.com/"
#
# When using tinyproxy as a reverse proxy, it is STRONGLY recommended
# that the normal proxy is turned off by uncommenting the next directive.
#
#ReverseOnly Yes
#
# Use a cookie to track reverse proxy mappings. If you need to reverse
# proxy sites which have absolute links you must uncomment this.
#
#ReverseMagic Yes
#
# The URL that's used to access this reverse proxy. The URL is used to
# rewrite HTTP redirects so that they won't escape the proxy. If you
# have a chain of reverse proxies, you'll need to put the outermost
# URL here (the address which the end user types into his/her browser).
#
# If not set then no rewriting occurs.
#
#ReverseBaseURL "http://localhost:8888/"
would "closing the request body when it starts seeing a response" be something we see in the pcap.
You'd see a FIN packet.
Hm. Maybe. That or a zero length chunk.
Let me dig into those pcaps more. Something fishy is definitely going on here.
🐟
Ok, so the only thing I can think of is that somewhere in go-ipfs we're closing the inbound connection early.
This could be due to https://github.com/ipfs/go-ipfs/issues/5168, or it could be something else entirely.
if there's a bug in the HTTP stack then this could have broader implications
Well, if it's related to the issue I posted above, then it's not an issue for most users. It's only an issue when an HTTP server starts responding to a request before reading the entire request.
The strange thing is that we have a workaround for this issue which should "trick" Go's HTTP library into not truncating the body. So I'm not convinced this is the issue.
Encountered the same issue when reverse proxying ipfs rpc behind nginx. Easy to reproduce, here's the nginx config (v1.23), when uploading small size data like 10MB, it worked fine. But it failed when uploading large folders/data.
server {
listen 80;
listen [::]:80;
client_max_body_size 20G;
location / {
proxy_pass http://127.0.0.1:5001/;
# didn't affect the prematured eof, you can even remove the both lines.
proxy_read_timeout 10s;
proxy_buffering off;
}
}
Could anyone tell me how to config the reverse proxy like nginx... , it has blocked me for weeks... QQ update details java http client v1.4.1 nginx v1.23 ipfs v0.17.0
I have the same issue.
Checklist
Installation method
built from source
Version
Config
Description
POST /api/v0/add
will prematurely respond with "Nextpart: EOF" before even the first part has been sent if the HTTP request is split into multiple packets./ipfs/QmXNuNB13zYS6WDAjkq8YG6mPtdni9xUNrcSzX6xufda8Z/broken.pcap
shows a packet dump. note that the HTTP header set comes in two packets, after the second the API responds with the error before the HTTP request even had a chance to send the first part of the multipart request. by comparison/ipfs/QmcNL9Dv9XQ6GBDmHmaHByAhNUe5bNCNPkgpMTaDgqpJrP/works.pcap
is the exact same situation except that all HTTP headers come in one packet.the latter is IPFS command line standard behavior. the former (broken) can e.g. per provoked by inserting an HTTP proxy like tinyproxy in the middle.