`BufferTest.testCharBufferConstructor` uses random bytes

cyrille-artho commented 1 month ago

The String constructor with an array of bytes as an argument requires a correctly encoded string. Quoting from the API docs: "Constructs a new String by decoding the specified array of bytes using the platform's default charset. [...] The behavior of this constructor when the given bytes are not valid in the default charset is unspecified."

We want to replace the random bytes with two examples, one using UTF-8 and the other one Latin-1. We want to be sure that (a) non-ASCII characters are covered in each test, (b) the complete test string is fully encoded with only valid bytes within the given test array. How to get example strings: Write a small Java program with two example strings, print them as hexadecimal, run that in the regular JVM. (a) at least one multi-byte character in UTF-8; (b) at least one two-byte non-ASCII character from the Latin-1 set.

eklaDFF commented 4 days ago

isReadOnly() returns :

true
false
false
false

https://chatgpt.com/share/fca61611-90ab-489b-b7cd-7ee517739831

cyrille-artho commented 4 days ago

Thanks, case 3 was unexpected for me! I even added additional checks before and after the read operation, and it seems the buffer is read-write throughout its lifetime. Perhaps this is because the buffer has to be written by the I/O library, even if it's not meant to be written to by the recipient (prior to the call to flip)? In any case, it's good that we now know what the return value of isReadOnly should be. Having the right "ground truth" avoids surprises later.

eklaDFF commented 4 days ago

How should I implement this behaviour ?

If we return true in CharBuffer.java is throwing error in arrayOffset() section .

cyrille-artho commented 3 days ago

I think the return values above are correct; the problem with the indices is separate from that. For that, we need a solution that uses the correct values, depending on the string encoding.

eklaDFF commented 3 days ago

It's a bit confusing to me now, what next should I do ?

Is it worth it to share the current status using my forked repo ?

cyrille-artho commented 3 days ago

You can create a branch on your forked repo to share your work in progress.

eklaDFF commented 2 days ago

https://github.com/eklaDFF/jpf-core/tree/BufferTestBRANCH

cyrille-artho commented 2 days ago

Please update your own master branch so it is in sync with git@github.com:javapathfinder/jpf-core.git. This way, we can compare your recent changes more easily.

eklaDFF commented 2 days ago

Updated!

cyrille-artho commented 1 day ago

I still see all the previous changes when trying git diff. However, I can reproduce the three failures in BufferTest. As a first step towards finding the root cause, add more assertions to the code. Right now, we have a failure in JPF_java_lang_String.java:96 at the call to obj.getChars. If you add assertions that print more details (e.g., the value of coder and the byte array in addition to begin, end, length), we can see better what goes wrong. It can also help to add logging, so we see what the data looks like in the successful test.

eklaDFF commented 1 day ago

"I still see all the previous changes when trying git diff." -----> Should I commit all the changes on master* ?

"As a first step towards finding the root cause, add more assertions to the code." -----> adding assertions in BufferTest.java like assertTrue(...) etc in all three test methods ?

"If you add assertions that print more details (e.g., the value of coder and the byte array in addition to begin, end, length), we can see better what goes wrong. It can also help to add logging, so we see what the data looks like in the successful test." -----> Where to write these ?

cyrille-artho commented 1 day ago

What I meant was that you bring your master branch up to date w.r.t. the main repo: git checkout master; git pull git@github.com:javapathfinder/jpf-core.git master
Yes, add such assertions, and make use of the optional "error message" argument; that message shows what went wrong if an assertion failed.
You can trace the values of key variables in the String methods and related code. This way (when using JPF), you see what these values are. It is possible that they diverge from the "ground truth" even before an assertion fails, as assertions usually don't capture 100 % of the details.

eklaDFF commented 7 hours ago

I tried below code to print the values for parameters like srcPos, dstBegin as you asked above in model class String.java:

public void getBytes(byte[] dst, int srcPos, int dstBegin, byte coder, int length){
//      getBytes(srcPos,(srcPos+length),dst,dstBegin);
        System.out.println("Message from String.java[605] getBytes(...)[ coder = " + coder + ", srcPos = " + srcPos + ", dstBegin = " + dstBegin + ", dst[].length = " + dst.length + ", length = " + length);
        if (isLatin1()) {
            StringLatin1.getBytes(value,srcPos,(srcPos+length),dst,dstBegin);
        } else {
            StringUTF16.getBytes(value,srcPos,(srcPos+length),dst,dstBegin);
        }
    }

But when we run the test case, then it gets hanged on. Do not know why .

"Right now, we have a failure in JPF_java_lang_String.java:96 at the call to obj.getChars" -----> do we need to modify for getChars like getBytes() ? Like Below, checking isLatin()

public void getBytes(byte[] dst, int srcPos, int dstBegin, byte coder, int length){
  //    getBytes(srcPos,(srcPos+length),dst,dstBegin);
    if (isLatin1()) {
        StringLatin1.getBytes(value,srcPos,(srcPos+length),dst,dstBegin);
    } else {
        StringUTF16.getBytes(value,srcPos,(srcPos+length),dst,dstBegin);
    }

}

Should we start fixing compiler warning because both issues (#459, #470) might take longer time than we assumed.

javapathfinder / jpf-core

`BufferTest.testCharBufferConstructor` uses random bytes #459