eclipse-ocl / org.eclipse.ocl

Eclipse Public License 2.0
0 stars 0 forks source link

[releng] Change default encoding of the tests #447

Open eclipse-ocl-bot opened 1 month ago

eclipse-ocl-bot commented 1 month ago

| --- | --- | | Bugzilla Link | 291310 | | Status | NEW | | Importance | P3 minor | | Reported | Oct 05, 2009 04:07 EDT | | Modified | May 27, 2011 02:10 EDT | | Version | 1.3.0 | | Blocks | 318368 | | Reporter | Laurent Goubet |

Description

Created attachment 148738 (attachment deleted)\ Reencodes tests in UTF-8

One of the tests consistently fails on Linux (or any machine which default\ charset is not either ISO-8859-1 or CP1252) :\ RegressionTest#test_hebrew_singleQuote_135321() . This is due to this test\ containing non-ASCII characters and being encoded in ISO-8859-1.

As we're supposed to be internationalized, ISO-8859-1 is not the best option we\ have. The attached patch re-encodes the two instances of RegressionTest in\ UTF-8 and adds a new test which sole purpose is to fail if run in any other\ encoding.

I am unsure if that patch can be applied on windows as the patch file would\ probably be read in CP1252 instead of UTF-8. Adolfo or me would gladly apply\ and commit the changes ;).

eclipse-ocl-bot commented 1 month ago

By Laurent Goubet on Oct 05, 2009 04:13

Created attachment 148739 (attachment deleted)\ Reencodes tests in UTF-8 and checks if default encoding is UTF-8 as needed

I wanted to attach both patches at once, it seems bugzilla doesn't accept it.

The previously attached patch adds a new TestCase at the very beginning of both ecore and uml suites to check if the tests are run in UTF-8 as we discussed on bug 287977. I don't really like this approach as the "UTF-8 test" is far away from the actual tests it is needed for. This second patch adds the required "ifs" for the incriminated tests only so that their failures has a better error message to them. This approach will need all new tests presenting encoding problems to be commited with the same "if", but I think this is cleaner. Ultimately, this can allow us to deactivate the tests if they're not run in UTF-8 (we simply need braces for the "else" and to comment out the call to "fail(...)".

Team, which of these would you rather see applied?

eclipse-ocl-bot commented 1 month ago

By Ed Willink on Oct 05, 2009 15:32

Indeed the patch doesn't apply well on Windows. After manually cutting and pasting via Notepad2 the tests pass on UTF-8 and fail on Cp1252 and then pass again on UTF-8 as required.

Patch definitely to be applied from a Unix machine. No problem since it should be the author to keep the project dashboard statistics right.

I think the localised test is adequate, though I would factor out the test into AbstractTestSuite.checkForUTF8Encoding() method that can be called by any test that cares. It doesn't matter if it's missed from some test, just so long as one test checks.

"These tests should be run in UTF-8" would be more helpful as\ "The Preference for the Eclipse Workspace Text file encoding is not set to UTF-8"

+1


If this is the only usage, there is a separate issue of where the justification for these non ASCII quotes comes from. There is absolutely nothing in UML or OCL to justify them and a certain amount to prohibit them. OCL allows only exactly '

Ah, but this is not the only usage; we were falling over lower-casing sigma earlier.

eclipse-ocl-bot commented 1 month ago

By Laurent Goubet on Oct 06, 2009 03:15

Created attachment 148856 (attachment deleted)\ Revised patch - extracts check to superclass

Thanks for checking this Ed, manually copy pasting the patch must have been a real pain.

Indeed, the idea of extracting it to the AbstractTestSuite didn't even cross my mind. I'll do it as I need to re-use that for the problematic tests of bug 287977. I'll also change the error message as you proposed; this seems more informative.

I already have Adolfo's +1 from the discussion on bug 287977, I'll need Alexander's to ensure we know there is a risk for the build :).

eclipse-ocl-bot commented 1 month ago

By Ed Willink on Oct 06, 2009 04:34

As pains go it was pretty painless.

eclipse-ocl-bot commented 1 month ago

By Alexander Igdalov on Oct 06, 2009 10:25

Laurent, I haven't deeply dived into the problem but can we explicitly set the charset info for the RegressionTest.java file?\ I mean just right-click this file in the Navigator, select Properties and change the text encoding at the Resources property page. The file itself won't change but the charset info will be recorded in .settings folder of the project. This should help.\ Sorry if this comment is irrelevant.

eclipse-ocl-bot commented 1 month ago

By Laurent Goubet on Oct 06, 2009 10:41

Alexander,

We can explicitely set the charset info for this file, and that'll probably settle the problem ... for anyone that is within Eclipse. Standalone, this is not possible.

I can change the charset of this file to UTF-8 and commit the .settings on CVS along with the specific test "checkEncodingIsUTF8". The main problem is : if we change the file encoding to UTF-8, will the build still pass on the Eclipse build system?

eclipse-ocl-bot commented 1 month ago

By Alexander Igdalov on Oct 06, 2009 11:31

(In reply to comment #6)

Alexander,

We can explicitely set the charset info for this file, and that'll probably settle the problem ... for anyone that is within Eclipse. Standalone, this is not possible.

This setting is used by the compiler. And the compiler must recognize it regardless of whether it is launched from inside Eclipse or in standalone mode. org.eclipse.core.resources plugin (which regulates the setting) should always be present. Otherwise, inter-plugin dependencies and other things won't be built correctly.\ Note that this setting mustn't influence the runtime. IOW the tests should pass regardless of the .settings folder presence in the tests plugin.

I can change the charset of this file to UTF-8 and commit the .settings on CVS along with the specific test "checkEncodingIsUTF8". The main problem is : if we change the file encoding to UTF-8, will the build still pass on the Eclipse build system?

Yes, it must pass in this case. Though I am pretty sure that with the encoding setting, checkEncodingIsUTF8() will always pass.

IOW, +1 for reencoding to UTF-8 accompanied with the charset info setting. In any case, UTF-8 is more neutral than Cp1252.

eclipse-ocl-bot commented 1 month ago

By Ed Willink on Oct 06, 2009 15:03

Alex's solution is much better; a shame none of the rest of us knew about it.

Let's just set UTF-8 at the root of all test plugins. Then we can forget about this problem. Of course we still need the patch to change the tests to UTF-8.

We also need a call of checkForUTF8 since it will fail on Windows if build.properties omits .settings, which it currently does.

eclipse-ocl-bot commented 1 month ago

By Alexander Igdalov on Oct 07, 2009 14:49

(In reply to comment #8)

Let's just set UTF-8 at the root of all test plugins. Then we can forget about this problem. Of course we still need the patch to change the tests to UTF-8.

That's even better.

We also need a call of checkForUTF8 since it will fail on Windows if build.properties omits .settings, which it currently does.

Well, initially I thought that the encoding setting influences the .class bytecode so it would work regardless of whether the settings file is included into the deployed jar. But at least the Eclipse Export Deployable Plugin wizard produces the same .class regardless of the encoding setting.

IOW, we should also include the settings file into build.properties. Now I am not that sure but I hope the build procedure won't fail. Otherwise, we'll have to find a workaround for that.

eclipse-ocl-bot commented 1 month ago

By Laurent Goubet on Oct 08, 2009 04:13

Created attachment 149096 changes encoding at the project level

The attached sports all changes from the second patch (checkForUTF8(), reencoding of the java files) and alters all of the OCL plugins to set their default encoding to UTF-8. Tests now pass whatever the value of the workspace-level encoding. I couldn't find any way for the export as -> deployable plugins to export classes with the needed encoding though :/.

Adding "compilerArg = -encoding UTF-8" in the build.properties didn't do the trick, neither did adding "javacDefaultEncoding.. = UTF-8". If you can't see a way for this export to work, I believe it would be best to commit this patch as is, try and run a nightly build on the eclipse server and see if the tests pass.

:notepad_spiral: OCLTestsEncoding.patch

eclipse-ocl-bot commented 1 month ago

By Ed Willink on Nov 12, 2009 02:59

Alex: Following the successful-ish M3 build, is this now resolved?

eclipse-ocl-bot commented 1 month ago

By Alexander Igdalov on Nov 12, 2009 05:59

(In reply to comment #11)

Alex: Following the successful-ish M3 build, is this now resolved?

As I see it the patch is not yet committed. I am +1 to committing it and verifying that it works.

eclipse-ocl-bot commented 1 month ago

By Ed Willink on Nov 12, 2009 15:27

Alex: I think that we've all +1'd subject to your experimentation to check that it does not impact the builds. So this is for you to commit when you're happy.

eclipse-ocl-bot commented 1 month ago

By Alexander Igdalov on Nov 12, 2009 15:39

(In reply to comment #13)

Alex: I think that we've all +1'd subject to your experimentation to check that it does not impact the builds. So this is for you to commit when you're happy.

Well, I thought Laurent would like to commit his own patch by himself;-)\ But for sure, I can do it too =))

eclipse-ocl-bot commented 1 month ago

By Laurent Goubet on Nov 16, 2009 03:36

Ed, Alex,

I haven't been working on OCL for quite some time, thus I indeed haven't commited this. Should I do it now or have you done it youself Alex?

eclipse-ocl-bot commented 1 month ago

By Alexander Igdalov on Nov 16, 2009 05:41

(In reply to comment #15)

Ed, Alex,

I haven't been working on OCL for quite some time, thus I indeed haven't commited this. Should I do it now or have you done it youself Alex?

I haven't yet committed the patch. Laurent, I am ok if you commit it.

eclipse-ocl-bot commented 1 month ago

By Laurent Goubet on Nov 16, 2009 09:29

I have commited this on head. I switched laptops recently and haven't had time to re-install my debian on it though, so I commited from windows. Hope this hasn't broken anything.

I think it'd be best to launch a nightly build so as to check that everything's okay with this patch; leaving the bug open till it's been verified.

eclipse-ocl-bot commented 1 month ago

By Alexander Igdalov on Nov 17, 2009 03:35

Hi Laurent!

We got failed tests. See https://build.eclipse.org/hudson/job/cbi-mdt-ocl-3.0/56/\ Note that you can launch the nightly builds on your own after logging on the Hudson build page.

Cheers,

eclipse-ocl-bot commented 1 month ago

By Laurent Goubet on Nov 17, 2009 04:27

Alex,

IIRC, 19 of these failures are UML specific and under investigation (can't seem to find the bug/mail where I saw this). The remaining 2 are indeed the two "RegressionTest.test_hebrew_singleQuote_135321" which are the two "encoding-specific" tests.

This would mean our build process builds in either ISO-8859-1 or CP1252; either way not UTF-8 which is what I feared back in comment #c6 . Our options are :

1 - revert back to latin 1 and force linux users to switch to it from UTF-8. I am not keen on this as we should be internationalized and latin 1 is far from a good choice for this.\ 2 - disable these two specific tests when encoding of the running Eclipse is different than UTF-8.\ 3 - try and find a way to force the build machine to build OCL in UTF-8 instead of its default.

I can handle 1 and 2 if our choice goes to on of them, 3 is something I don't know how to do. Nick might be able to help us on this issue if we decide to go for 3. That would be my preferred choice.

eclipse-ocl-bot commented 1 month ago

By Ed Willink on Dec 04, 2009 16:12

Although the intent was to enforce UTF-8 on JUnit plug-ins, the commit affected the non-test plug-ins too. Existing files that were non-UTF-8 encoded, notably five files with an a-acute in Sanchez now cause build failures in IBM SDK 5.0 (but not on IBM SDK 6.0, or Sun JDKs).

Since the UTF-8 has now been applied, let's leave it and just fix the non-UTF-8 characters. Since there are currently 5 Sanchez with an acioute and 11 without and Adolfo is happy to lose his acute, I'll simplify the problem files.

eclipse-ocl-bot commented 1 month ago

By Ed Willink on Dec 04, 2009 16:34

Problem files committed to HEAD.

eclipse-ocl-bot commented 1 month ago

By Ed Willink on Dec 06, 2009 11:45

Despite setting -encoding UTF-8 for the build, the build still fails.

Since the hebrew quote regression tests test something that will be illegal in OCL 2.3, assuming that current 2 in favour noone against ballot succeeds, I see no point keeping these breaking tests. Therefore the non-ASCII character tests are commented, pending a rewrite and a bug fix to force UTF-8 during the build.

eclipse-ocl-bot commented 1 month ago

By Ed Willink on Dec 06, 2009 11:55

Despite setting -encoding UTF-8 for the build, the build still fails.

Since the hebrew quote regression tests test something that will be illegal in OCL 2.3, assuming that current 2 in favour noone against ballot succeeds, I see no point keeping these breaking tests. Therefore the non-ASCII character tests are commented, pending a rewrite and a bug fix to force UTF-8 during the build.

The affected tests are the two variants of RegrssionTest.test_hebrew_singleQuote_135321.

eclipse-ocl-bot commented 1 month ago

By Ed Willink on Dec 07, 2009 01:44

The longer diagnostic in checkUTFEncoding reported in Build #67 that acute (\u00B4) was compiled as 0xC2 0xB4 which is the UTF-8 byte sequence for (\u00B4). CVS fetch ok. Just need the build compiler to use the -encoding UTF-8.

eclipse-ocl-bot commented 1 month ago

By Laurent Goubet on Dec 07, 2009 03:45

Thanks for tackling this Ed, seems like I didn't pay enough attention when committing as I didn't notice the acute characters in Adolfo's name. As for forcing the build to use UTF-8, that was the issue I feared we'd encounter. The potential workaround I tested back before comment #c10 weren't enough to enforce UTF-8 builds. Nick might have some info?

eclipse-ocl-bot commented 1 month ago

By Alexander Igdalov on Dec 22, 2009 05:42

Hi all,

Since the illegal quote test is commented out, do we still want to keep this bug open?

eclipse-ocl-bot commented 1 month ago

By Ed Willink on Dec 22, 2009 06:00

Yes. I reassigned it to releng, and now to minor.

Ideally we would find out how to make the

-encoding = UTF-8

work in the build compilerArgs.

It would be much nicer to write Unicode tests in Unicode.

eclipse-ocl-bot commented 1 month ago

By Alexander Igdalov on Dec 22, 2009 06:11

(In reply to comment #27)

Ideally we would find out how to make the

-encoding = UTF-8

work in the build compilerArgs.

It would be much nicer to write Unicode tests in Unicode.

Ok, agreed. Adding Nick to CC, perhaps he can help us with that.

eclipse-ocl-bot commented 1 month ago

By Nick Boldt on Jan 07, 2010 19:36

(In reply to comment #28)

(In reply to comment #27)

Ideally we would find out how to make the

-encoding = UTF-8

work in the build compilerArgs.

It would be much nicer to write Unicode tests in Unicode.

Ok, agreed. Adding Nick to CC, perhaps he can help us with that.

Can you not just set that as a name=value property in the test plugin(s)' build.properties file?

http://eclipsesource.com/blogs/2009/01/09/tip-encoding-issues-with-plug-ins/

eclipse-ocl-bot commented 1 month ago

By Adolfo Sanchez-Barbudo Herrera on Feb 08, 2011 08:54

After reading all the thread I've tried to play with this issue... I simply uncommented the offending tests, and run the test cases, however the RegressionTest#test_hebrew_singleQuote_135321() test fails in my workspace.... If I'm not wrong I had understood that test cases should pass in any local workspaces regardless the workspace encoding, am I missing anything ?

I could try to fix this now that I have some knowledge on how the builds are working...

P.S: Should every new project (there several new example and tests plugins) also set the UTF-8 encoding at project's level.

Regards,\ Adolfo.

eclipse-ocl-bot commented 1 month ago

By Ed Willink on Feb 08, 2011 12:16

Yes, in principle projects should use UTF-8, but in practice it's only necessary for test plugins with non-ASCII trest characters.

I think the problem was that we could get interactive usage working portably but could not force the build system to use UTF-8. New system, new skills, maybe you're clever enough to crack it.

eclipse-ocl-bot commented 1 month ago

By Adolfo Sanchez-Barbudo Herrera on Feb 09, 2011 11:42

Ok, but I thought that using the current setup (after different commits done during this bugzilla) should at least make the offending test work in my space...

Well, I think I won't spend time on this bug by the moment, I'm currently working on having Core and Tools different builds.

Cheers,\ Adolfo.