phax / ph-ubl

Java library for reading and writing UBL 2.0, 2.1, 2.2, 2.3 and 2.4 documents
Apache License 2.0

EmbeddedDocumentBinaryObjectType value only Bytes() can't be String #48

Closed ssdf34 closed 2 years ago

ssdf34 commented 2 years ago

// Input
String VTL_Base64 = "ATjZhdik2LPYs9ipINi52KjYr9in2YTZhNmHINit2LPZhiDYp9mE2YbYp9i12LEg2YTZhNiw2YfYqAIOMzAwMTg4ODY1MDAwMDMDEzIwMjItMDQtMDhUMjM6MzI6MTYEBzEwOTcuMDAFBjE0My4wOQ==";
//--------------------------------
DocumentReferenceType drt3 = new DocumentReferenceType();
drt3.setID("QR");
AttachmentType atype3 = new AttachmentType();
EmbeddedDocumentBinaryObjectType Emdb3 = new EmbeddedDocumentBinaryObjectType();
Emdb3.setMimeCode("text/plain");
Emdb3.setValue(VTL_Base64.getBytes());
// The output is something else, not the same as the input.

Is there a solution to this problem?

phax commented 2 years ago

You don't need to Base64 encode the value yourself. In your code you provide the raw binary content, and the Base64 encoding happens automatically in the background.
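To illustrate why the output differed from the input, here is a minimal, dependency-free sketch. It does not use ph-ubl at all: the hypothetical `serialize` method only stands in for the automatic encoding step, on the assumption that it applies standard Base64 to the `byte[]` passed to `setValue`. The class and method names are made up for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class DoubleEncodingDemo {
    // Stand-in (assumption) for the Base64 step applied to the byte[] on output.
    static String serialize(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    // Passing the bytes of the Base64 *text* encodes the content a second time.
    static String encodeTextBytes(String base64Text) {
        return serialize(base64Text.getBytes(StandardCharsets.US_ASCII));
    }

    // Decoding the Base64 text first hands over the raw bytes, so it is encoded once.
    static String decodeFirst(String base64Text) {
        return serialize(Base64.getDecoder().decode(base64Text));
    }

    public static void main(String[] args) {
        String alreadyBase64 = "SGVsbG8="; // Base64 of the ASCII text "Hello"
        System.out.println("double encoded: " + encodeTextBytes(alreadyBase64));
        System.out.println("encoded once:   " + decodeFirst(alreadyBase64));
    }
}
```

Under that assumption, a value that is already Base64 text would have to be decoded with `Base64.getDecoder().decode(...)` before it is passed to `setValue`.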

ssdf34 commented 2 years ago

Does the Base64 encoding support StandardCharsets.UTF_8? I ask because I encode the Arabic characters to hex and then encode that to Base64 using UTF-8.

ssdf34 commented 2 years ago

When I enter 012a416264756c6c61682048617373616e20416c2d4e617373657220476f6c6420436f72706f726174696f6e020F3331303132323339333530303030330314323032322d30342d32355431353a33303a30305a0407313030302e303005063135302e3030, it is supposed to give me ASpBYmR1bGxhaCBIYXNzYW4gQWwtTmFzc2VyIEdvbGQgQ29ycG9yYXRpb24CDzMxMDEyMjM5MzUwMDAwMwMUMjAyMi0wNC0yNVQxNTozMDowMFoEBzEwMDAuMDAFBjE1MC4wMA==

But unfortunately it gave me a different result.

ssdf34 commented 2 years ago

Thanks, I solved the issue:

String originalInput = "012a416264756c6c61682048617373616e20416c2d4e617373657220476f6c6420436f72706f726174696f6e020F3331303132323339333530303030330314323032322d30342d32355431353a33303a30305a0407313030302e303005063135302e3030";
HexBinaryAdapter adapter = new HexBinaryAdapter();
byte[] bytes = adapter.unmarshal(originalInput);
Emdb3.setValue(bytes);

The output now matches: ASpBYmR1bGxhaCBIYXNzYW4gQWwtTmFzc2VyIEdvbGQgQ29ycG9yYXRpb24CDzMxMDEyMjM5MzUwMDAwMwMUMjAyMi0wNC0yNVQxNTozMDowMFoEBzEwMDAuMDAFBjE1MC4wMA==
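For readers who cannot use `HexBinaryAdapter` (the `javax.xml.bind` package is no longer bundled with the JDK since Java 11), the same hex-to-bytes step can be sketched with plain JDK code. The class and method names below are hypothetical, and the Base64 call only mimics the encoding that happens on output; the short sample is the first three bytes of the hex string above.

```java
import java.util.Base64;

class HexToBase64Demo {
    // Convert a hex string (upper or lower case digits) into the bytes it represents.
    static byte[] hexToBytes(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("Hex string must have an even length");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("Invalid hex digit near index " + (2 * i));
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    // Mimics the automatic Base64 step applied to the raw bytes (assumption).
    static String hexToBase64(String hex) {
        return Base64.getEncoder().encodeToString(hexToBytes(hex));
    }

    public static void main(String[] args) {
        // First three bytes of the hex input from this thread.
        System.out.println(hexToBase64("012a41")); // prints "ASpB"
    }
}
```

The result of `hexToBytes` is what would be passed to `setValue`; the hex text itself never needs a character set, because it is decoded to bytes rather than encoded as text.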

phax commented 2 years ago

Okay, if the original String is already encoded (hex encoded, as in your example), then that needs to be handled separately. It is independent of Base64 :)

Assume I have a String that contains German umlauts. In that case I need to specify the character set explicitly when calling getBytes(), as in this (untested) example:

String s = "Test äöü";
byte[] b = s.getBytes (StandardCharsets.UTF_8);
EmbeddedDocumentBinaryObjectType Emdb3 = new EmbeddedDocumentBinaryObjectType();
Emdb3.setMimeCode("text/plain");
Emdb3.setValue(b);

How you get your raw byte[] is up to you, and you need to know how to do it. The difference between char and byte can be confusing when dealing with these issues for the first time.
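As a small illustration of the char versus byte difference, the following sketch counts both for the umlaut string from the example above. The class name is hypothetical, and the umlauts are written as Unicode escapes so that the result does not depend on the source file encoding.

```java
import java.nio.charset.StandardCharsets;

class CharVsByteDemo {
    // "Test äöü", with the umlauts written as escapes.
    static final String TEXT = "Test \u00e4\u00f6\u00fc";

    // Number of chars (UTF-16 code units) in the String.
    static int charCount() {
        return TEXT.length();
    }

    // Number of bytes when encoded as UTF-8: each umlaut takes two bytes.
    static int utf8ByteCount() {
        return TEXT.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes when encoded as ISO-8859-1: each umlaut takes one byte.
    static int latin1ByteCount() {
        return TEXT.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    public static void main(String[] args) {
        System.out.println("chars:            " + charCount());       // 8
        System.out.println("UTF-8 bytes:      " + utf8ByteCount());   // 11
        System.out.println("ISO-8859-1 bytes: " + latin1ByteCount()); // 8
    }
}
```

The same String yields different byte arrays depending on the character set, which is why calling `getBytes()` without an argument (platform default charset) can produce different output on different machines.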