Dear @DeordD
You are right, this resource is now redirected to HTTPS.
This issue was already fixed in the last release, so if you are not using the latest Docker package, we recommend that you upgrade to it.
Please let us know if this solves the issue.
Regards
Dear @danielnavarrogeo, we are deploying the war from 2020.1.2. I made the effort of comparing the war files (with md5sum) since v2020.1: the packaged war file has been the same since then (even the modification date after unzipping is the same: Mar 18 09:02).
md5sum 2020.1/validator.war
3746c958d032c0e348ea90f17424114b  2020.1/validator.war
md5sum 2020.1.1/validator.war
3746c958d032c0e348ea90f17424114b  2020.1.1/validator.war
md5sum 2020.1.2/validator.war
3746c958d032c0e348ea90f17424114b  2020.1.2/validator.war
I downloaded 2020.1 and 2020.1.2 again to rule out mistakes on my side.
Regards, Alex
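For anyone repeating this check across several releases, the comparison can be wrapped in a small helper. This is only a minimal sketch: `same_md5` is a hypothetical name, and it assumes `md5sum` from GNU coreutils is available, as used in the comment above.

```shell
#!/bin/sh
# Hypothetical helper: print "same" if every file given has the same MD5
# digest, otherwise print "different".
same_md5() {
  # md5sum prints "<digest>  <path>"; keep only the digest column and
  # count how many distinct digests there are.
  distinct=$(md5sum "$@" | awk '{ print $1 }' | sort -u | wc -l)
  if [ "$distinct" -eq 1 ]; then
    echo same
  else
    echo different
  fi
}
```

Called as `same_md5 2020.1/validator.war 2020.1.1/validator.war 2020.1.2/validator.war`, it would report `same` for the three files listed above, since their digests are identical.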
Dear @lglref32team2 The redirection from HTTP to HTTPS for the INSPIRE registry is handled in the Docker image, not in the .war file. In the Docker image we set up a proxy to redirect the INSPIRE requests, given that it is not good practice to let client applications follow redirections.
For now, we recommend that you either set up a proxy, use the pre-built image from the release, or build your own image using the Dockerfile and the other resources that you can find in the inspire-validator zip file.
Hi @carlospzurita , @danielnavarrogeo ,
I will give my feedback on this after the discussion in #319 is closed. It should then be clear whether the redirect from HTTP to HTTPS for the INSPIRE domain will remain stable and work in accordance with the TGs.
Hi @carlospzurita , @danielnavarrogeo
after the comments from @MarcoMinghini here, the situation is much clearer now.
To have all possible options for solving this problem on the table (some of them are stated here): is it possible to solve it (validation results differ with and without the proxy) on the ets-repository side only, without setting up a proxy?
I am aware this is not an optimal way to solve this, but it would be great to have a complete picture.
Thanks in advance!
Dear @DeordD
For now, no change is foreseen on the ETS side for the HTTPS redirection. Most of the URLs are not pre-processed and all requests are handled by the underlying HTTPClient library, so adding a check in every place that could contain an INSPIRE URI would make the tests excessively complex.
So the remaining options are to modify the services accordingly, set up the proxy in your own environment, or base your environment on the released Docker image.
I'm using a slightly modified version of the official docker image. When I request the file via curl http://localhost/metadata-codelist/SpatialDataServiceCategory/SpatialDataServiceCategory.en.xml from within the container, I get:
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>500 Proxy Error</title>
</head><body>
<h1>Proxy Error</h1>
The proxy server could not handle the request<p>Reason: <strong>Error during SSL Handshake with remote server</strong></p><p />
<hr>
<address>Apache/2.4.10 (Debian) Server at localhost Port 80</address>
</body></html>
Is the official docker image not based on the included dockerfile?
Dear @hwbllmnn
The Docker image published in the production instance is based on the included Dockerfile (with slight modifications for deployment-specific options). The redirection is set up in the same way.
Please take into account that there are some configurations and setup processes for both the Apache server acting as a proxy for the redirection and the squid3 service acting as a cache. The validator is configured to send its requests to the port where squid is running. The cache then uses its own hosts file to send all requests for the INSPIRE domain to the redirection server running at localhost. The Apache server is configured with a virtual domain to perform the redirection.
Take a look at the files in the res folder to check this configuration, and at how they are used in docker-entrypoint.sh.
Of course, this can be altered in any way you see fit, bearing in mind that you need to modify the etf-config.properties file inside the WAR to use whichever HTTP proxy you set up.
Hi @carlospzurita ,
I've played around with the proxy some more, especially the SSL options and arrived at the following configuration:
<VirtualHost *:*>
ServerAdmin carlospalma@guadaltel.com
ServerName inspire.ec.europa.eu
ErrorLog /var/log/apache2/inspire.ec.europa.eu-ssl-error_log
CustomLog /var/log/apache2/inspire.ec.europa.eu-ssl-access_log common
SSLProxyEngine On
SSLProxyCheckPeerName off
SSLProxyVerify none
SSLProxyCheckPeerCN off
ProxyPreserveHost On
DocumentRoot /var/www/html/
ProxyPass / https://inspire.ec.europa.eu/
ProxyPassReverse / https://inspire.ec.europa.eu/
</VirtualHost>
However, the server at inspire.ec.europa.eu delivers a 403 now:
root@86cc17f65c17:/var/lib/jetty# curl http://localhost/metadata-codelist/SpatialDataServiceCategory/SpatialDataServiceCategory.en.xml -D -
HTTP/1.1 403 Forbidden
Date: Wed, 12 Aug 2020 08:54:19 GMT
Server: Apache
X-Frame-Options: SAMEORIGIN
Content-Type: text/html; charset=iso-8859-1
Transfer-Encoding: chunked
Content-Language: en
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>403 Forbidden</title>
</head><body>
<h1>Forbidden</h1>
<p>You don't have permission to access /metadata-codelist/SpatialDataServiceCategory/SpatialDataServiceCategory.en.xml
on this server.</p>
</body></html>
After enabling trace8 logging in the apache config, I can see that the request is properly made via SSL. But it's the server that actually sends the 403. I assume I'll have to set a header to send with the proxy request in order to get the proxy to work?
Dear @hwbllmnn
Please take into account that we are also using squid3 as a service to cache the schemas, and that this service handles the requests to the INSPIRE registry. Have you checked the configuration done in the docker-entrypoint.sh file?
cp /etc/hosts /etc/squid_hosts
echo "127.0.0.1 inspire.ec.europa.eu" >> /etc/squid_hosts
service apache2 start
a2enmod ssl
a2enmod proxy
a2enmod rewrite
a2enmod proxy_http
a2dissite 000-default
a2ensite proxy.conf
service apache2 reload
rm -rf /var/spool/squid3/*
service squid3 start
We use an additional hosts file for squid that redirects all calls to inspire.ec.europa.eu, as added in the second line of the script above. We keep it separate from the /etc/hosts file so as not to interfere with other operations of the operating system, and to perform this redirection only for the validator. This is configured in the file squid.conf, where the caching is activated as well.
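The effect of the extra hosts file can be sketched as follows. This is only an illustration under assumptions: `lookup_host` is a hypothetical helper that mimics a plain hosts-file lookup (it is not squid's resolver), and a temporary file stands in for /etc/squid_hosts so that nothing under /etc is touched.

```shell
#!/bin/sh
# Hypothetical helper: print the first IP address that a hosts-format file
# associates with the given name.
lookup_host() {
  awk -v name="$2" '
    { sub(/#.*/, "") }                      # drop comments
    { for (i = 2; i <= NF; i++)             # fields after the IP are names
        if ($i == name) { print $1; exit } }
  ' "$1"
}

# Build a scratch file the same way docker-entrypoint.sh builds
# /etc/squid_hosts: existing entries first, then the override appended.
scratch_hosts=$(mktemp)
printf '127.0.0.1 localhost\n' > "$scratch_hosts"
echo "127.0.0.1 inspire.ec.europa.eu" >> "$scratch_hosts"

# Prints 127.0.0.1: lookups that use this file end up at the local Apache.
lookup_host "$scratch_hosts" inspire.ec.europa.eu
```

Only programs that read this file see the override; anything that resolves names through the regular /etc/hosts and DNS still reaches the real INSPIRE host.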
Dear @hwbllmnn
Did you have any success applying the latest comments on this issue? If you need anything else, please let us know.
Hi @carlospzurita ,
unfortunately, no. We're using an unmodified version of the docker-entrypoint.sh; the only changes to the Dockerfile are that we pre-download the scripts and add a few custom ones.
After that we're getting the SSL error from above. After applying the changes to the Apache config from here we still get the 403.
Please @hwbllmnn , is it possible for you to send us the current status of your installation? All the files involved: Dockerfile, entrypoint, server configurations... It may be better for us to set up everything on our premises and work on that deployment with full information and room to modify things.
Hi @carlospzurita ,
sure, here it comes:
You'll have to remove the two lines copying in extra scripts near the end of the Dockerfile as I probably cannot give those away freely (and they won't have anything to do with the proxy). Make sure you have the current validator.war next to the Dockerfile when building the image.
Note that I included the above changes to the Apache proxy.conf, so you'll probably get the access denied error from above. Thank you for looking into this, it's much appreciated!
Dear @hwbllmnn
We have been checking the files that you sent us. The Docker image built just fine, and the container started without any issue. The modifications that you made to the proxy.conf file only affect what is running inside the container; that is, they have no bearing on how the container communicates with the rest of the network that the host machine is connected to.
The 403 that you are getting from cURL happens because the setup in the container is not intended to be used this way. Unless you update your /etc/hosts file, a request to localhost does not go through the Apache virtual domain declared in proxy.conf. But modifying that file affects all requests from inside the container, not only the ones from the validator.
To handle this setup without interfering with any other requests, we set an alternative hosts file in the first lines of docker-entrypoint.sh:
cp /etc/hosts /etc/squid_hosts
echo "127.0.0.1 inspire.ec.europa.eu" >> /etc/squid_hosts
This file is then used by squid, the caching system for the schemas. This is configured in the file squid.conf, in the section
# TAG: hosts_file
# Location of the host-local IP name-address associations
# database. Most Operating Systems have such a file on different
# default locations:
# - Un*X & Linux: /etc/hosts
# - Windows NT/2000: %SystemRoot%\system32\drivers\etc\hosts
# (%SystemRoot% value install default is c:\winnt)
# - Windows XP/2003: %SystemRoot%\system32\drivers\etc\hosts
# (%SystemRoot% value install default is c:\windows)
# - Windows 9x/Me: %windir%\hosts
# (%windir% value is usually c:\windows)
# - Cygwin: /etc/hosts
#
# The file contains newline-separated definitions, in the
# form ip_address_in_dotted_form name [name ...] names are
# whitespace-separated. Lines beginning with an hash (#)
# character are comments.
#
# The file is checked at startup and upon configuration.
# If set to 'none', it won't be checked.
# If append_domain is used, that domain will be added to
# domain-local (i.e. not containing any dot character) host
# definitions.
#Default:
hosts_file /etc/squid_hosts
Then, the validator is configured to use the HTTP port of squid in these Dockerfile lines. It is important to note that these variables refer to a host and port inside the container. No host machine or server is being referred to here:
# Activate HTTP proxy server by setting a host (IP or DNS name).
# Default: "none" for not using a proxy server
ENV HTTP_PROXY_HOST localhost
# HTTP proxy server port. Default 8080. If you are using Squid it is 3128
ENV HTTP_PROXY_PORT 3128
So any request to inspire.ec.europa.eu coming from the validator is sent to this port. There, squid resolves the domain through the alternative hosts file to 127.0.0.1, so the request is routed to the Apache virtual domain, which handles the redirection from HTTP to HTTPS and sends it on to the real INSPIRE domain.
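To exercise the same path from a shell inside the container, a request can be sent through squid rather than straight to localhost. The following is a hedged sketch, not part of the official image: `proxy_url` and `probe_via_squid` are hypothetical helper names, and the defaults assume the HTTP_PROXY_HOST and HTTP_PROXY_PORT values shown in the Dockerfile lines above.

```shell
#!/bin/sh
# Assumption: defaults mirror the ENV values from the Dockerfile lines above.
PROXY_HOST="${HTTP_PROXY_HOST:-localhost}"
PROXY_PORT="${HTTP_PROXY_PORT:-3128}"
CODELIST_URL="http://inspire.ec.europa.eu/metadata-codelist/SpatialDataServiceCategory/SpatialDataServiceCategory.en.xml"

# Build the proxy URL in the form curl expects for its -x option.
proxy_url() {
  printf 'http://%s:%s\n' "$PROXY_HOST" "$PROXY_PORT"
}

# Hypothetical probe: request the codelist through squid and print only the
# HTTP status code. Nothing is requested until this function is called.
probe_via_squid() {
  curl -sS -o /dev/null -w '%{http_code}\n' -x "$(proxy_url)" "$CODELIST_URL"
}
```

Running `probe_via_squid` inside the container sends the request along the route described above (squid, then the Apache virtual domain, then the real INSPIRE host), whereas a plain `curl http://localhost/...` bypasses squid and its alternative hosts file.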
In any case, none of the configuration inside the container has any effect on this particular issue, because it is related to the networking of the Docker installation. If you are still getting errors accessing the codelists through the validator, it may be related to a configuration issue of your Docker client. Please check the latest notes on the release, mainly the "Exposing the validator through a proxy" section. There you will find an explanation of how to work around proxy issues.
If you need any more feedback or clarification, please contact us.
I'm not sure how this will help me. Since I need to use a corporate proxy, that one needs to be configured, so the proxy on localhost inside the container will not be used anyway?
Apart from that, I still get that 403 when requesting the Apache reverse proxy inside the container, so even if it were used, the codelists would not be available.
The proxy on localhost inside the container is used by the ETF to cache requests for external resources and to handle the redirection for the INSPIRE registry.
You need to configure your Docker client, that is, the Docker installation on your machine, to use the corporate proxy and give the container access. That is what the resources in the release notes refer to, and what you would need to apply to your configuration.
The 403 code from inside the container will always persist if you use a tool such as cURL to perform the requests. As explained in my last comment, there is a special configuration for the cache system (squid3), which uses an alternate hosts file. This hosts file changes the location of the domain inspire.ec.europa.eu to 127.0.0.1, where the internal Apache is running. As a result, the virtual domain set on Apache receives communications from squid3 and from the validator. Any other request pointing directly to localhost will not have any result, as the Apache server is not configured to run under that alias.
I hope that this diagram may clarify this. The red arrow is the configuration bit that is explained in the section "Exposing the validator through a proxy" on the release page.
Ok, thanks for the clarification. I didn't get the point that we HAVE to configure Docker to use the proxy for all external traffic. That unfortunately seems not to be an option for us, though (we're running on Kubernetes).
Thanks again for staying with us on this issue, it's much appreciated!
Hi @carlospzurita ,
could you please check my summary of the discussion in this issue (the intention was to write it in a non-technical way so that people can understand the challenge here):
Feel free to change my summary in any way needed.
Thanks a lot!
Dear @DeordD
I think you have everything covered in your summary for this issue. One thing to point out is that the schema caching solution was already included in the Docker image of version 2020.1. But everything else is correct.
@carlospzurita I have changed it. Thanks a lot!
Dear all,
Thank you very much for your contributions, we hope everything is clear. Please, if you have any other questions or problems, do not hesitate to open another issue.
Best regards.
The following issues occurred during the 2021.2 deployment:
Hi,
we are using the v2020.1.2 branch in our environment to check metadata requirements. The test "md sds 3.4" fails because of the following error:
It seems that the resource http://inspire.ec.europa.eu/metadata-codelist/SpatialDataServiceCategory/SpatialDataServiceCategory.en.xml has been redirected to HTTPS.
The same test passes in the INSPIRE Validator production instance (resources cached?).
Any advice what could be the problem here?
Thanks in advance.
UPDATE 23.11.2020
Following aspects were discussed in this issue so far:
UPDATE 28.06.2021
Following aspects have been documented in
HTTPs caching
Schema caching solution unstable