XLSForm / pyxform

A Python package to create XForms for ODK Collect.
BSD 2-Clause "Simplified" License
77 stars 134 forks source link

fix: remove java version check #540

Closed lindsay-stevens closed 3 years ago

lindsay-stevens commented 3 years ago

Closes #481

Why is this the best possible solution? Were any other approaches considered?

For some years there was a Look Before You Leap check which parsed java -version output. However that is a maintenance burden. Java versions and version specification formats frequently change. The project documentation already specifies that Java 8 or newer is required. So it should not be surprising if pyxform fails on earlier Java versions.

What are the regression risks?

Users on systems with older java versions would see the older, perhaps less nice / clear error as shown in #257. If the problem is that system java is old but newer alternatives installed, these users could try running pyxform with an overridden PATH pointing to that java version e.g. PATH=/usr/lib/jvm/jre-11/bin:$PATH java -version or PATH=/usr/lib/jvm/jre-11/bin:$PATH pyxform ...

Does this change require updates to documentation?

No, the java version requirement is already in the readme.

Before submitting this PR, please make sure you have:

codecov-commenter commented 3 years ago

Codecov Report

Merging #540 (b491ed0) into master (d054d38) will decrease coverage by 0.10%. The diff coverage is 66.66%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #540      +/-   ##
==========================================
- Coverage   84.00%   83.90%   -0.11%     
==========================================
  Files          25       25              
  Lines        3732     3733       +1     
  Branches      872      871       -1     
==========================================
- Hits         3135     3132       -3     
- Misses        452      454       +2     
- Partials      145      147       +2     
Impacted Files Coverage Δ
pyxform/validators/enketo_validate/__init__.py 34.21% <50.00%> (-0.08%) :arrow_down:
pyxform/validators/odk_validate/__init__.py 66.66% <57.14%> (-8.81%) :arrow_down:
pyxform/validators/util.py 77.45% <100.00%> (+1.40%) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update d054d38...b491ed0. Read the comment docs.

lognaturel commented 3 years ago

Huh, is that all that's needed? I hadn't looked into it when I made that comment but my expectation was that at the very least there would have to be some new error messages introduced. What happens with this change when there's no java at all or java 7? Is there a sensible error produced by an existing code path? Or is Validate just skipped?

lindsay-stevens commented 3 years ago

The check_xform func goes on to call _call_validator which in turn calls popen with a command line. After that first call, the returncode is already checked - any non-zero code raises an error (or warning). The content of that error/warning is the stderr stream from the call, for example in #257. In a more normal case, stderr would contain messages from ODK Validate.

So for example if there's no java on the system there would be return code != 0 and error raised.

> if which java; then echo "eq 0"; else echo "ne 0"; fi
/usr/bin/java
eq 0
> if which java-not-real; then echo "eq 0"; else echo "ne 0"; fi
/usr/bin/which: no java-not-real in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin)
ne 0
>  if not-java; then echo "eq 0"; else echo "ne 0"; fi
bash: not-java: command not found...
ne 0

If L37 were changed from java to not-java:

> python ~/repos/pyxform/repo/pyxform/validators/odk_validate/__init__.py test

odk_validate.py
A python wrapper around ODK Validate

Traceback (most recent call last):
  File "/home/lindsay/repos/pyxform/repo/pyxform/validators/odk_validate/__init__.py", line 94, in <module>
    check_xform(sys.argv[1])
  File "/home/lindsay/repos/pyxform/repo/pyxform/validators/odk_validate/__init__.py", line 68, in check_xform
    returncode, timeout, _stdout, stderr = _call_validator(path_to_xform=path_to_xform)
  File "/home/lindsay/repos/pyxform/repo/pyxform/validators/odk_validate/__init__.py", line 36, in _call_validator
    return run_popen_with_timeout(
  File "/home/lindsay/repos/pyxform/repo/pyxform/validators/util.py", line 59, in run_popen_with_timeout
    p = Popen(command, stdin=PIPE, stdout=PIPE, stderr=PIPE, startupinfo=startup_info)
  File "/usr/local/lib/python3.8/subprocess.py", line 858, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/local/lib/python3.8/subprocess.py", line 1704, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'not-java'

If we specify Java 7:

> PATH=/usr/lib/jvm/jre-1.7.0/bin:$PATH python ~/repos/pyxform/repo/pyxform/validators/odk_validate/__init__.py test

odk_validate.py
A python wrapper around ODK Validate

Traceback (most recent call last):
  File "/home/lindsay/repos/pyxform/repo/pyxform/validators/odk_validate/__init__.py", line 94, in <module>
    check_xform(sys.argv[1])
  File "/home/lindsay/repos/pyxform/repo/pyxform/validators/odk_validate/__init__.py", line 76, in check_xform
    raise ODKValidateError(
__main__.ODKValidateError: ODK Validate Errors:
Exception in thread "main" java.lang.UnsupportedClassVersionError: org${FormValidator} : Unsupported major.minor version 52.0

Showing a particular message when java is not available or the wrong version goes back to the #481 problem space, since identifying the cause of the failure would require parsing the stderr content.

lognaturel commented 3 years ago

Showing a particular message when java is not available or the wrong version goes back to the #481 problem space

Ideally we could catch cases where Validate couldn't run and show a message like "Something went wrong when trying to run ODK Validate. Please verify you have Java 8+ available. You can also run pyxform with the --skip_validate flag to skip form validation." But I think what you're saying is that because Validate provides its validation errors through exceptions, you can't identify whether there was a validation error or a problem running Validate.

As you say, it's fairly unlikely someone will have Java <8 today but not having Java at all still happens periodically. I think it's worth a little bit of time seeing if there's something that can be done to make the latter case clearer at least. That seems within reach since there's a specific Python Error to catch.

In order to catch the Java version issue, you'd need to capture what's returned from Validate and match on UnsupportedClassVersionError or something like that?

lindsay-stevens commented 3 years ago

As you say, it's fairly unlikely someone will have Java <8 today but not having Java at all still happens periodically. I think it's worth a little bit of time seeing if there's something that can be done to make the latter case clearer at least. That seems within reach since there's a specific Python Error to catch.

The latter case yields FileNotFoundError: [Errno 2] No such file or directory: 'not-java'. I think this is reasonably clear, but otherwise could you please provide a preferred wording?

In order to catch the Java version issue, you'd need to capture what's returned from Validate and match on UnsupportedClassVersionError or something like that?

Yes, but I think this opens up a similar issue. Maybe there's a different error for Java 6/5/4, or OpenJDK says something different, etc. If you'd like it to look for UnsupportedClassVersionError, please let me know, and a preferred error wording.

lognaturel commented 3 years ago

Thanks for talking it through!

I’m convinced about not doing anything for unsupported Java versions.

Let’s do a message for the no-Java case: “Form validation failed because Java 8+ could not be found. Please install Java or run pyxform with the --skip_validate flag." This replaces the more cryptic “pyxform odk validate dependency: java not found"

I agree with you that for someone who knows there’s a Java dependency and has some dev experience, the raw stack trace is clear enough. But for better or worse all kinds of people end up trying to run pyxform from sometimes very surprising environments.

lindsay-stevens commented 3 years ago

Hi @lognaturel , I've added a check for Java being available at all, with a more friendly error message. In the message I added on "add the installed Java to the environment path." since it's a possibility for why Java can't be found. Also added some tidy-up around the underlying Popen call by using a dataclass for the return data, since that seemed a bit clearer than the previous anonymous tuple.

lognaturel commented 3 years ago

👍 On the message and the other changes.

What happens on systems that don’t have which? I don’t think Windows does. That makes Validate impossible to run, right?

I expected running Validate without any preemptive check and catching Python errors from that. But maybe doing java —version instead of which java would be a platform-agnostic alternative that would get this to the finish line without any restructuring?

lindsay-stevens commented 3 years ago

Sorry, I figured that the Appveyor build would catch any Windows issues. I had been running locally only on CentOS. It looks like the Appveyor environment includes the MinGW/Cygwin toolchain which probably puts which on the PATH, so the tests passed OK. I've changed the java check to use shutil.which from the python stdlib which should be cross-platform.

On that note (probably for another ticket) I'm seeing a few test failures when the suite is run on my local Windows 10 machine. The tests were for test_external_csv_instances, test_external_csv_instances_w_choice_filter, and test_external_xml_instances. All the same error, different file names. In this PR I haven't modified code related to these tests, but please let me know if you'd prefer these were fixed before merge.

Error
Traceback (most recent call last):
  File "C:\Users\lindsay\repos\pyxform\repo\pyxform\tests_v1\pyxform_test_case.py", line 202, in assertPyxformXform
    self._run_odk_validate(xml=xml)
  File "C:\Users\lindsay\repos\pyxform\repo\pyxform\tests_v1\pyxform_test_case.py", line 97, in _run_odk_validate
    check_xform(tmp.name)
  File "C:\Users\lindsay\repos\pyxform\repo\pyxform\validators\odk_validate\__init__.py", line 98, in check_xform
    raise ODKValidateError(
pyxform.validators.odk_validate.ODKValidateError: ODK Validate Errors:
java.io.IOException: Access is denied

>> Something broke the parser. See above for a hint.

The following files failed validation:
C:\WINDOWS\Temp\tmp7c5s7g7z.xml

Result: Invalid
lognaturel commented 3 years ago

Thanks so much for iterating on this with me. There are always so many little things to consider even with simple-seeming issues!

It looks like the Appveyor environment includes the MinGW/Cygwin toolchain which probably puts which on the PATH

Ahh, that makes sense. I was also surprised CI didn't catch it!

I'm seeing a few test failures when the suite is run on my local Windows 10 machine.

Would be great to file an issue for another day. 🙏

lindsay-stevens commented 3 years ago

Great, thanks for your help @lognaturel ! I'll add a ticket for those test issues.