Closed mashhurs closed 4 months ago
For future reference, this is the difference in manticore behaviour that we see:
> string_entity = Java::org.apache.http.entity.StringEntity.new("\xAC")
=> #<Java::OrgApacheHttpEntity::StringEntity:0x2f930be7>
> Java::org.apache.http.util.EntityUtils.toString(string_entity)
=> "?"
> byte_array_entity = Java::org.apache.http.entity.ByteArrayEntity.new("\xAC".to_java_bytes)
=> #<Java::OrgApacheHttpEntity::ByteArrayEntity:0x27c243a3>
> Java::org.apache.http.util.EntityUtils.toString(byte_array_entity)
=> "¬"
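One plausible reading of the session above, sketched in plain Ruby with no Apache classes involved. It assumes that both the entity and `EntityUtils.toString` fall back to ISO-8859-1 when no charset is given, and that the invalid byte becomes U+FFFD when the Ruby string is converted to a Java string. Both are assumptions about defaults, not something the session itself shows, and the helper names are hypothetical.

```ruby
# Plain-Ruby simulation of the two paths; helper names are hypothetical.
RAW = "\xAC".b # the single byte 0xAC, which is not valid UTF-8 on its own

# ByteArrayEntity path: the byte is passed through untouched and is only
# interpreted (here as ISO-8859-1, an assumed default) when read back.
def read_as_latin1(bytes)
  bytes.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
end

# StringEntity path (assumed): the invalid byte first becomes U+FFFD, which
# ISO-8859-1 cannot represent, so the encoder substitutes "?".
def through_string_entity(bytes)
  text = bytes.dup.force_encoding(Encoding::UTF_8).scrub("\uFFFD")
  text.encode(Encoding::ISO_8859_1, undef: :replace).encode(Encoding::UTF_8)
end

puts read_as_latin1(RAW)        # prints ¬
puts through_string_entity(RAW) # prints ?
```

The point of the sketch is only that the byte-array path never re-encodes the payload, while the string path does, so the original byte is lost before it reaches the wire.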
Travis is failing in the 7.x integration tests because the job logs are too verbose, and Travis terminates the build when the logs exceed the job's maximum length:
The job exceeded the maximum log length, and has been terminated.
-- Job Output
I ran both of these dockerized jobs locally and they are green (successful).
The following CI jobs failed, but I confirm they pass locally, so the failures should be related to Travis. They are not blockers.
2733.3 | INTEGRATION=true ELASTIC_STACK_VERSION=7.x | Linux | errored
2733.4 | INTEGRATION=true ELASTIC_STACK_VERSION=7.x SNAPSHOT=true LOG_LEVEL=info | Linux | errored
env vars
INTEGRATION=true
ELASTIC_STACK_VERSION=7.x
SNAPSHOT=true # without snapshot also fine
LOG_LEVEL=info
docker setup: ./.ci/docker-setup.sh
Fetching versions from https://raw.githubusercontent.com/elastic/logstash/master/ci/logstash_releases.json
"7.17.19-SNAPSHOT"
Translated 7.x to 7.17.19-SNAPSHOT
Testing against version: 7.17.19-SNAPSHOT
Pulling docker.elastic.co/logstash/logstash:7.17.19-SNAPSHOT
7.17.19-SNAPSHOT: Pulling from logstash/logstash
Digest: sha256:e1b7f8e923565d3b5307637d5a984d48ea82802cd110433ab03957e544711b56
Status: Image is up to date for docker.elastic.co/logstash/logstash:7.17.19-SNAPSHOT
docker.elastic.co/logstash/logstash:7.17.19-SNAPSHOT
Pulling docker.elastic.co/elasticsearch/elasticsearch:7.17.19-SNAPSHOT
...
=> => writing image sha256:3dcbb6d89a3a6f8e97439f40036897a00f9b43f3066b15d5327a6cbf15fac15f 0.0s
[+] Building 8.0s (18/18) FINISHED
docker run: ./.ci/docker-run.sh
w0w" }
ci-logstash-1 | increases number of successful inserted documents
ci-logstash-1 |
ci-logstash-1 | Pending: (Failures listed here are expected and do not affect your suite's status)
ci-logstash-1 |
ci-logstash-1 | 1) pool sniffer Simple sniff parsing with single node should return the correct sniff URL
ci-logstash-1 | # No reason given
ci-logstash-1 | # ./spec/integration/outputs/sniffer_spec.rb:37
ci-logstash-1 |
ci-logstash-1 |
ci-logstash-1 | Finished in 5 minutes 21 seconds (files took 4.22 seconds to load)
ci-logstash-1 | 136 examples, 0 failures, 1 pending
ci-logstash-1 |
ci-logstash-1 | Randomized with seed 65243
ci-logstash-1 |
ci-logstash-1 exited with code 0
Aborting on container exit...
[+] Running 2/2
✔ Container ci-elasticsearch-1 Stopped 0.3s
✔ Container ci-logstash-1 Stopped
Description
Current buggy behaviours:
- with compression enabled (`compression_level > 0`), the event gets rejected if it contains invalid (non-UTF-8) byte sequences;
- with compression disabled (`compression_level = 0`), the event gets accepted even though it contains invalid (non-UTF-8) byte sequences. The reason is that the `manticore` HTTP client replaces them under the hood (1 byte becomes 3 bytes; the 2 extra bytes can be seen in the Apache trace logs) when it uses the Apache `StringEntity`.

This PR introduces an immediate fix and opens a discussion about the long-term, general use case. Verified with the Apache client trace logs that the bytes being sent do not change.
- immediate fix: replace invalid byte sequences with the replacement character (`\uFFFD`). The idea follows common practice (how most current software behaves, for example editors) and has the benefit of keeping the event usable (with as many valid parts as possible) instead of throwing.
- for the long term (requires a discussion), see the comment https://github.com/logstash-plugins/logstash-output-elasticsearch/pull/1169#pullrequestreview-1944161216. The `manticore` HTTP client has logic where, if a request body is given, it uses either `ByteArrayEntity` or (Apache's common core) `StringEntity`. Since `StringEntity` converts the payload, the original bytes change if they are invalid UTF-8. From my point of view, `manticore` shouldn't apply any conversion, regardless of encoding, and should use `ByteArrayEntity`. No idea what feature/behavior `manticore` was going to provide with `StringEntity`.
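The one-byte-to-three-byte growth and the `\uFFFD` replacement described above can be reproduced in plain Ruby. This is a minimal sketch, not the plugin's actual code: `scrub_invalid_utf8` and `REPLACEMENT` are hypothetical names, and it assumes the payload is meant to be UTF-8.

```ruby
# Hypothetical helper illustrating the "replace invalid bytes" idea.
REPLACEMENT = "\uFFFD" # Unicode replacement character, 3 bytes in UTF-8

# Returns a valid UTF-8 copy of the payload; every invalid byte sequence
# is swapped for REPLACEMENT, and valid input is returned unchanged.
def scrub_invalid_utf8(payload)
  text = payload.dup.force_encoding(Encoding::UTF_8)
  text.valid_encoding? ? text : text.scrub(REPLACEMENT)
end

original = "a\xACb".b # 0xAC is not a valid UTF-8 sequence on its own
cleaned  = scrub_invalid_utf8(original)

puts original.bytesize       # prints 3
puts cleaned.bytesize        # prints 5 (the 1 invalid byte became 3 bytes)
puts cleaned.valid_encoding? # prints true
```

Doing the replacement explicitly before the body is handed to the HTTP client means the bytes on the wire are the ones the sender chose, whichever entity type the client picks.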