Closed robertwb closed 11 years ago
Replying to @jdemeyer:
What I'm missing in this patchbomb is a high-level overview of how everything works. To me, it looks like a large lump of code (which seems to work pretty well) which is hard to understand.
I will add somw top level documentation next week.
Replying to @jdemeyer:
Do you guys consider it a feature or a bug that
""" sage: 1e16 # relative tol 1e-10 10^16 """
doesn't work? See #12815.
Feature. See new comment on that ticket.
Looks like I have to depend on this for the numpy-1.7 upgrade at #11334, so I am tagging along :)
It is interesting. With each new version of numpy we seem to have new instances of
Warning: divide by zero encountered in divide
appearing in doctests. This suggest that numpy sometimes mixes stderr and stdout or that they are very intent to tell you about these warnings. All the instances of this warning that are fixed in attachment: 12415_stderr_vs_51b5.patch also appears with the upgrade to numpy 1.7.0 - plus some more.
Combined attachment: 12415_stderr_vs_51b5.patch and attachment: 12415_doctest_fixes.patch and rebased.
attachment: 12415_framework.patch needs to be rebased but displayhook.py
changed substantially in #12719, so I don't know what to do.
Description changed:
---
+++
@@ -12,7 +12,7 @@
and does most of the above (plus more which has been moved to followup tickets #12720 and #12722)
-**Apply** [attachment: 12415_stderr_vs_51b5.patch](https://github.com/sagemath/sage/files/ticket12415/12415_stderr_vs_51b5.patch.gz), [attachment: 12415_framework.patch](https://github.com/sagemath/sage-prod/files/10654698/12415_framework.patch.gz), [attachment: 12415_doctest_fixes.patch](https://github.com/sagemath/sage-prod/files/10654706/12415_doctest_fixes.patch.gz), [attachment: 12415_doc.patch](https://github.com/sagemath/sage-prod/files/10654702/12415_doc.patch.gz)
+**Apply** [attachment: 12415_framework.patch](https://github.com/sagemath/sage-prod/files/10654698/12415_framework.patch.gz), [attachment: 12415_doctest_fixes.patch](https://github.com/sagemath/sage-prod/files/10654706/12415_doctest_fixes.patch.gz), [attachment: 12415_doc.patch](https://github.com/sagemath/sage-prod/files/10654702/12415_doc.patch.gz)
**Apply** [attachment: 12415_script.patch](https://github.com/sagemath/sage-prod/files/10654697/12415_script.patch.gz) to the scripts repo
Attachment: 12415_spkg_bin_sage.patch.gz
Apply to root repository
Changed dependencies from #13147,#13146, #13145, #12723, #12392, #12393, #12395, #12396, #12397, #12381, #12382, #12383, #12384, #11871, #13195, #13121, #13748, #13899 to #13147,#13146, #13145, #12723, #12392, #12393, #12395, #12396, #12397, #12381, #12382, #12383, #12384, #11871, #13195, #13121, #13748, #13899, #12719, #5155
Patch to script repo; now removes old files
Attachment: 12415_script.patch.gz
Is this because of the displayhook?
File "devel/sage/sage/combinat/cluster_algebra_quiver/quiver.py", line 996, in sage.combinat.cluster_algebra_quiver.quiver.ClusterQuiver.mutation_sequence
Failed example:
[T.b_matrix() for T in seq]
Expected:
[
[ 0 1 0 0] [ 0 -1 0 0] [ 0 1 -1 0]
[-1 0 -1 0] [ 1 0 -1 0] [-1 0 1 0]
[ 0 1 0 1] [ 0 1 0 1] [ 1 -1 0 1]
[ 0 0 -1 0], [ 0 0 -1 0], [ 0 0 -1 0]
]
Got:
[[ 0 1 0 0]
[-1 0 -1 0]
[ 0 1 0 1]
[ 0 0 -1 0], [ 0 -1 0 0]
[ 1 0 -1 0]
[ 0 1 0 1]
[ 0 0 -1 0], [ 0 1 -1 0]
[-1 0 1 0]
[ 1 -1 0 1]
[ 0 0 -1 0]]
Replying to @jdemeyer:
Is this because of the displayhook?
File "devel/sage/sage/combinat/cluster_algebra_quiver/quiver.py", line 996, in sage.combinat.cluster_algebra_quiver.quiver.ClusterQuiver.mutation_sequence Failed example: [T.b_matrix() for T in seq] Expected: [ [ 0 1 0 0] [ 0 -1 0 0] [ 0 1 -1 0] [-1 0 -1 0] [ 1 0 -1 0] [-1 0 1 0] [ 0 1 0 1] [ 0 1 0 1] [ 1 -1 0 1] [ 0 0 -1 0], [ 0 0 -1 0], [ 0 0 -1 0] ] Got: [[ 0 1 0 0] [-1 0 -1 0] [ 0 1 0 1] [ 0 0 -1 0], [ 0 -1 0 0] [ 1 0 -1 0] [ 0 1 0 1] [ 0 0 -1 0], [ 0 1 -1 0] [-1 0 1 0] [ 1 -1 0 1] [ 0 0 -1 0]]
Yeah, that's the kind of error that would occur because of displayhook. I'll take a look.
All other files (except attachment: 12415_framework.patch) have been rebased to sage-5.7.beta1.
Great. I'm building a copy of sage-5.7.beta1 now and will try to rebase attachment: 12415_framework.patch to it later today.
Replying to @roed314:
Great. I'm building a copy of sage-5.7.beta1 now and will try to rebase attachment: 12415_framework.patch to it later today.
That would be great!
I would really like to finish this ticket. I feel that we're almost there, the main thing I would like to change is the handling of subprocesses (including but not limited to killing).
The new doctesting code
Attachment: 12415_framework.patch.gz
I think I've fixed the display hook problem. I just ran make ptestlong and found some errors; I'm looking into them and will report back.
I'm working on some general documentation, but some help on the annihilate
method would be useful, since I know nothing about process control groups.
There are still doctest failures in the doctest framework, see comment [comment:140].
More doctest framework doctest failures:
sage -t --long devel/sage/sage/doctest/control.py
**********************************************************************
File "devel/sage/sage/doctest/control.py", line 566, in sage.doctest.control.DocTestController.cleanup
Failed example:
DC.run()
Exception raised:
Traceback (most recent call last):
File "/release/sage-5.7.beta3-doctest/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 595, in _run
self.execute(example, compiled, test.globs)
File "/release/sage-5.7.beta3-doctest/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 980, in execute
exec compiled in globs
File "<doctest sage.doctest.control.DocTestController.cleanup[8]>", line 1, in <module>
DC.run()
File "/release/sage-5.7.beta3-doctest/local/lib/python2.7/site-packages/sage/doctest/control.py", line 725, in run
self.test_safe_directory()
File "/release/sage-5.7.beta3-doctest/local/lib/python2.7/site-packages/sage/doctest/control.py", line 270, in test_safe_directory
.format(os.getcwd()))
RuntimeError: refusing to run doctests from the current directory '/home/jdemeyer/.sage/temp/sage.math.washington.edu/10273/dir_2B_1k4/test' since untrusted users could put files in this directory, making it unsafe to run Sage code from
**********************************************************************
Then there is a bunch of
/release/sage-5.7.beta3-doctest/local/lib/python2.7/site-packages/Cython/Build/Dependencies.py:229: UnicodeWarning: Unicode equal comparis
on failed to convert both arguments to Unicode - interpreting them as being unequal
if code[q-1] == u'\\':
in between the tests, for example:
sage -t --long devel/sage/sage/combinat/species/empty_species.py
[38 tests, 0.1 s]
sage -t --long devel/sage/sage/combinat/sf/kfpoly.py
[60 tests, 0.1 s]
sage -t --long devel/sage/sage/combinat/words/shuffle_product.py
[63 tests, 0.1 s]
/release/sage-5.7.beta3-doctest/local/lib/python2.7/site-packages/Cython/Build/Dependencies.py:229: UnicodeWarning: Unicode equal comparis
on failed to convert both arguments to Unicode - interpreting them as being unequal
if code[q-1] == u'\\':
sage -t --long devel/sage/sage/categories/algebra_modules.py
[9 tests, 0.1 s]
sage -t --long devel/sage/sage/numerical/backends/ppl_backend.pyx
[167 tests, 0.1 s]
sage -t --long devel/sagenb-main/sagenb/misc/misc.py
[36 tests, 0.0 s]
Quick question: how many DocTestWorker
instances are created? Just one? Or N if we set MAKE=make -jN
?
There is a lot of
if output_semaphore is not None:
output_semaphore.acquire()
in the code which could be simplified assuming that output_semaphore
is never None
. Could we ensure that?
In fact, the whole block
try:
if output_semaphore is not None:
output_semaphore.acquire()
do_something()
finally:
if output_semaphore is not None:
output_semaphore.release()
could be simplified to
with output_semaphore:
do_something()
Replying to @roed314:
The main point of
--serial
is that it allows you to debug problems in the doctesting framework without worrying about multiprocessing. :-)
I vote to completely remove the --serial
option since for me the only effect is that it makes the code more complicated. Consider for example:
except KeyboardInterrupt:
if result_pipe is None:
# We're in serial mode
raise
# Otherwise, we've been started by a worker: we'll return the result in the finally block
result = 0, DictAsObject(dict(err='ctlC')), ""
[...]
finally:
if result_pipe is None:
return result
else:
result_pipe.put(result, False)
If I want to fix KeyboardInterrupt
s in doctests, I now have to think about two code paths instead of one.
Description changed:
---
+++
@@ -12,7 +12,7 @@
and does most of the above (plus more which has been moved to followup tickets #12720 and #12722)
-**Apply** [attachment: 12415_framework.patch](https://github.com/sagemath/sage-prod/files/10654698/12415_framework.patch.gz), [attachment: 12415_doctest_fixes.patch](https://github.com/sagemath/sage-prod/files/10654706/12415_doctest_fixes.patch.gz), [attachment: 12415_doc.patch](https://github.com/sagemath/sage-prod/files/10654702/12415_doc.patch.gz)
+**Apply** [attachment: 12415_framework.patch](https://github.com/sagemath/sage-prod/files/10654698/12415_framework.patch.gz), [attachment: 12415_doctest_fixes.patch](https://github.com/sagemath/sage-prod/files/10654706/12415_doctest_fixes.patch.gz), [attachment: 12415_doc.patch](https://github.com/sagemath/sage-prod/files/10654702/12415_doc.patch.gz), [attachment: 12415_review.patch](https://github.com/sagemath/sage-prod/files/10654704/12415_review.patch.gz)
**Apply** [attachment: 12415_script.patch](https://github.com/sagemath/sage-prod/files/10654697/12415_script.patch.gz) to the scripts repo
Changed dependencies from #13147,#13146, #13145, #12723, #12392, #12393, #12395, #12396, #12397, #12381, #12382, #12383, #12384, #11871, #13195, #13121, #13748, #13899, #12719, #5155 to #13147,#13146, #13145, #12723, #12392, #12393, #12395, #12396, #12397, #12381, #12382, #12383, #12384, #11871, #13195, #13121, #13748, #13899, #12719, #5155, #14070
Any clue where this failure comes from?
sage -t --long devel/sagenb-main/sagenb/notebook/worksheet.py
**********************************************************************
File "devel/sagenb-main/sagenb/notebook/worksheet.py", line 197, in sagenb.notebook.worksheet.Worksheet.?
Failed example:
sagenb.notebook.misc.notebook = nb
Exception raised:
Traceback (most recent call last):
File "/release/sage-5.7.beta3-doctest/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 595, in _run
self.execute(example, compiled, test.globs)
File "/release/sage-5.7.beta3-doctest/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 980, in execute
exec compiled in globs
File "<doctest sagenb.notebook.worksheet.Worksheet.?[2]>", line 1, in <module>
sagenb.notebook.misc.notebook = nb
AttributeError: 'module' object has no attribute 'misc'
**********************************************************************
Answering my own question, it seems that N DocTestWorkers
are started with make -j N
.
But I don't understand the design choice of having two levels of multiprocessing: one from the master process to the DocTestWorker
and one from the DocTestWorker
to the DocTestTask
.
I surely need to think more about this, but I propose a redesign with only one level of multiprocessing, completely bypassing the DocTestWorker
class. This would give the benefits of "serial" processing while still allowing parallel doctests.
Changed dependencies from #13147,#13146, #13145, #12723, #12392, #12393, #12395, #12396, #12397, #12381, #12382, #12383, #12384, #11871, #13195, #13121, #13748, #13899, #12719, #5155, #14070 to #13147,#13146, #13145, #12723, #12392, #12393, #12395, #12396, #12397, #12381, #12382, #12383, #12384, #11871, #13195, #13121, #13748, #13899, #12719, #5155, #14070, #14079
I would love to cooperate on this ticket, but I'm having a really hard time to understand everything. As I said before, this absolutely needs some high-level documentation explaining the various classes involved and their inter-relations (in particular, the classes in forker.py
).
Replying to @jdemeyer:
I would love to cooperate on this ticket, but I'm having a really hard time to understand everything. As I said before, this absolutely needs some high-level documentation explaining the various classes involved and their inter-relations (in particular, the classes in
forker.py
).
Hi Jeroen, Sorry for the slow response time: I've had a visitor here in Calgary who's leaving this evening. I've started writing up overall documentation, and will work on it tonight.
In answer to your question about two layers of multiprocessing, the reason is to insulate the controlling process from segfaults and serious failures in the files being tested. If we come up with a design that satisfies that goal and also allows testing with multiple processes through sage -t -p N
I'm happy to change things. I certainly wouldn't call myself an expert on writing multi-process code.
I'll write more this evening, and also review #14079 and #14080.
Replying to @roed314:
In answer to your question about two layers of multiprocessing, the reason is to insulate the controlling process from segfaults and serious failures in the files being tested. If we come up with a design that satisfies that goal and also allows testing with multiple processes through
sage -t -p N
I'm happy to change things.
We obviously need one level of multiprocessing to allow segmentation faults. But my idea would be to skip the first level of multiprocessing and move the code which is currently in DocTestWorker
to the top-level process. I think that would simplify the code substantially.
Replying to @jdemeyer:
Replying to @roed314:
In answer to your question about two layers of multiprocessing, the reason is to insulate the controlling process from segfaults and serious failures in the files being tested. If we come up with a design that satisfies that goal and also allows testing with multiple processes through
sage -t -p N
I'm happy to change things.We obviously need one level of multiprocessing to allow segmentation faults. But my idea would be to skip the first level of multiprocessing and move the code which is currently in
DocTestWorker
to the top-level process. I think that would simplify the code substantially.
How would you run multiple files in parallel then?
Replying to @roed314:
How would you run multiple files in parallel then?
Like I said. Have one master process doing the jobs of the current master process + the current DocTestWorker
. That master process would start child processes for every file to be doctested, check for timeouts and crashes of the child processes.
And use #14079 to manage everything without busy-waiting.
Okay. I have no objections to removing the -serial
option and changing the process management as you suggest. It sounds like you have a good idea of how to rewrite it; how can I be useful? I can certainly review #14080 and #14079 and add overall documentation. What else would be useful?
Replying to @roed314:
how can I be useful?
There are still some doctest failures left. I fixed a few in my reviewer patch (and #14070) but there are some I couldn't immediately fix. There is also the issue with TAB characters in devel/sagenb/sagenb/data
, ideally that directory would be skipped from doctesting.
But try not to change forker.py
too much, as I'm thinking of substantial changes there.
Replying to @jdemeyer:
Replying to @roed314:
how can I be useful?
There are still some doctest failures left. I fixed a few in my reviewer patch (and #14070) but there are some I couldn't immediately fix. There is also the issue with TAB characters in
devel/sagenb/sagenb/data
, ideally that directory would be skipped from doctesting.But try not to change
forker.py
too much, as I'm thinking of substantial changes there.
Sounds good. I'll work on fixing doctest failures.
I also don't like this:
except IOError:
# File doesn't exist
result = 0, DictAsObject(dict(err='file')), ""
You assume that any IOError
comes from the fact that the doctested file doesn't exist. But it could also be a problem instead with the temporary file used for output (no space on /tmp
for example). Then the error message "File not found" would be very confusing. I think we need to explicitly check the existence of the doctested file and not handle IOError
specially.
In "normal" mode (non-verbose, non-debug, non-serial), does the DocTestTask
or DocTestRunner
ever print something to the real (unspoofed) stdout/stderr?
Replying to @jdemeyer:
In "normal" mode (non-verbose, non-debug, non-serial), does the
DocTestTask
orDocTestRunner
ever print something to the real (unspoofed) stdout/stderr?
I don't think so.
Replying to @jdemeyer:
I also don't like this:
except IOError: # File doesn't exist result = 0, DictAsObject(dict(err='file')), ""
You assume that any
IOError
comes from the fact that the doctested file doesn't exist. But it could also be a problem instead with the temporary file used for output (no space on/tmp
for example). Then the error message "File not found" would be very confusing. I think we need to explicitly check the existence of the doctested file and not handleIOError
specially.
Fair enough.
I think the word "output" is badly overused (there is the standard output, the file output=os.tmpfile()
, the string output
referring to the contents of the output
file and delayed_output
).
I propose to rename:
delayed_output
to messages
output
(the file) to outtmpfile
(or tmpoutfile
?)And am I right that delayed_output
is empty in normal mode when there are no doctest failures?
Replying to @jdemeyer:
Can I replace "
delayed_output
" by "messages
" because I think the word "output" is overused (there is the standard output, the fileoutfile=os.tmpfile()
anddelayed_output
).And am I right that
delayed_output
is empty in normal mode when there are no doctest failures?
Sure, either "messages
" or "delayed_messages
" is fine. It should be empty in normal mode with no failures.
Edited [comment:181] because I noticed there is output
the file and output
the contents of the file.
Replying to @jdemeyer:
I think the word "output" is badly overused (there is the standard output, the file
output=os.tmpfile()
, the stringoutput
referring to the contents of theoutput
file anddelayed_output
).I propose to rename:
delayed_output
tomessages
output
(the file) toouttmpfile
(ortmpoutfile
?)
I like outtmpfile
.
First version of my updated doctesting framework.
Replying to @jdemeyer:
First version of my updated doctesting framework.
Some comments:
doctest
module, summarize
should take verbose
as the first argument and msgfile
as an optional second argument.sage.ext.pselecter
rather than `sage.ext.pselect, unless you changed #14079.following
in _parallel_dispatch
?Overall, I think the changes are good ones.
Replying to @roed314:
- For compatibility with Python's
doctest
module,summarize
should takeverbose
as the first argument andmsgfile
as an optional second argument.
Fair enough. I never considered this compatibility, so it is quite possible that I introduced more incompatibilities.
- It's
sage.ext.pselecter
rather than `sage.ext.pselect, unless you changed #14079.
Yes, I did change #14079 but didn't realise that I hadn't uploaded the new patch. Anyway, I am looking into the problem on OS X.
- Currently I get a timeout whenever I run tests on a file.
Could be the same OS-dependent problem as the one you reported in #14079.
- Why did you disable the code that sets
following
in_parallel_dispatch
?
Simply because it is work in progress, new tests are needed. My DocTestWorker
has a substantially different function from the old DocTestWorker
, so I cannot recycle the tests.
Updated patch (I haven't looked at the OS X specific issue).
I noticed a change in behaviour and I'm wondering whether it is intentional.
Old doctesting framework outputs like:
sage -t file.py
<messages>
[4.1 s]
New doctesting framework outputs like:
<messages>
sage -t file.py
[1 test, 1 failure, 1.7 s]
I dislike the last way of outputting, especially in --verbose
mode, because there is no mention of what file is being tested. You see:
Running doctests with ID 2013-02-13-12-55-39-de348564.
Doctesting 1 file.
Trying (line 2): cython('sig_on()')
Expecting nothing
ok [1.61 s]
...
"line 2" of which file?...
Replying to @jdemeyer:
I noticed a change in behaviour and I'm wondering whether it is intentional.
Old doctesting framework outputs like:
sage -t file.py <messages> [4.1 s]
New doctesting framework outputs like:
<messages> sage -t file.py [1 test, 1 failure, 1.7 s]
I dislike the last way of outputting, especially in
--verbose
mode, because there is no mention of what file is being tested. You see:Running doctests with ID 2013-02-13-12-55-39-de348564. Doctesting 1 file. Trying (line 2): cython('sig_on()') Expecting nothing ok [1.61 s] ...
"line 2" of which file?...
Adding the number of tests and number of failures was intentional (to which I assume you don't object). The change in order is an artifact of the code that delayed printing. When there are lots of processes running parallel, you can't output
sage -t file.py
before starting testing, since the messages will get mixed up. I agree that the line describing which file is being tested should come at the beginning.
Replying to @roed314:
I agree that the line describing which file is being tested should come at the beginning.
OK, so you agree with
sage -t file.py
<messages>
[1 test, 1 failure, 1.7 s]
I would move the part of report()
which prints the "sage -t ..." header to a new function report_start()
(for example) and then correct the logic in forker.py
to call that function at the appropriate time. Does that sound good?
There are several improvements that would be good to make, including but not limited to:
Robert's code is at
http://code.google.com/p/sagemath-timer/
Apply attachment: 12415_framework.patch, attachment: 12415_doctest_fixes.patch, attachment: 12415_doc.patch, attachment: 12415_review.patch, attachment: 12415_test.patch, attachment: 12415_review_review.patch, attachment: 12415_review3.patch, attachment: 12415_manifest.patch, attachment: 12415_rebase_58.patch
Apply attachment: 12415_script.patch and attachment: 12415_script_review.patch to the scripts repo
Apply attachment: 12415_spkg_bin_sage.patch to the root repo
For follow-up or other doctest-related tickets see #11337.
Depends on #13147 Depends on #13146 Depends on #13145 Depends on #12723 Depends on #12392 Depends on #12393 Depends on #12395 Depends on #12396 Depends on #12397 Depends on #12381 Depends on #12382 Depends on #12383 Depends on #12384 Depends on #11871 Depends on #13195 Depends on #13121 Depends on #13748 Depends on #13899 Depends on #12719 Depends on #5155 Depends on #14070 Depends on #14079 Depends on #14150 Depends on #14158 Depends on #14182 Depends on #14184 Depends on #14054 Depends on #14063 Depends on #13605 Depends on #14111 Depends on #14254 Depends on #14242 Depends on #14253
CC: @kini @ohanar @jhpalmieri
Component: doctest framework
Author: David Roe, Robert Bradshaw, Jeroen Demeyer
Reviewer: Jeroen Demeyer, David Roe
Merged: sage-5.9.beta0
Issue created by migration from https://trac.sagemath.org/ticket/12415