IDSIA / sacred

Sacred is a tool to help you configure, organize, log and reproduce experiments developed at IDSIA.
MIT License
4.25k stars 383 forks source link

STD output is not captured 40% of the times #224

Open asanakoy opened 7 years ago

asanakoy commented 7 years ago

Sacred version: 0.7.1 Python: 2.7.11 Ubuntu 14.04

2 out of 5 times the field captured_out in the mongoDB entry doesn't contain strings which I print to stdout with print

Minimal snippet to reproduce:

from sacred import Experiment
from sacred.observers import MongoObserver

ex = Experiment('trololo')
ex.observers.append(MongoObserver.create(url=u'129.206.117.36'))

@ex.config
def my_config():
    foo = 42
    bar = 'baz'

@ex.capture
def some_function(a, foo, bar=10):
    print(a, foo, bar)

@ex.automain
def my_main(_run):
    print 'here we go'
    some_function(1)
    for i in xrange(10):
        print i
    print 'end'

After running 5 times:

> db.runs.find({}, {captured_out: 1})
{ "_id" : 18, "captured_out" : "INFO - trololo - Running command 'my_main'\nINFO - trololo - Started run with ID \"18\"\nINFO - trololo - Completed after 0:00:00\nhere we go\n(1, 42, u'baz')\n0\n1\n2\n3\n4\n5\n6\n7\n8\n9\nend\n" }
{ "_id" : 19, "captured_out" : "INFO - trololo - Running command 'my_main'\nINFO - trololo - Started run with ID \"19\"\nINFO - trololo - Completed after 0:00:00\nhere we go\n(1, 42, u'baz')\n0\n1\n2\n3\n4\n5\n6\n7\n8\n9\nend\n" }
{ "_id" : 20, "captured_out" : "INFO - trololo - Running command 'my_main'\nINFO - trololo - Started run with ID \"20\"\nINFO - trololo - Completed after 0:00:00\nhere we go\n(1, 42, u'baz')\n0\n1\n2\n3\n4\n5\n6\n7\n8\n9\nend\n" }
{ "_id" : 21, "captured_out" : "INFO - trololo - Running command 'my_main'\nINFO - trololo - Started run with ID \"21\"\nINFO - trololo - Completed after 0:00:00\n" }
{ "_id" : 22, "captured_out" : "INFO - trololo - Running command 'my_main'\nINFO - trololo - Started run with ID \"22\"\nINFO - trololo - Completed after 0:00:00\n" }

May be related with #101.

asanakoy commented 7 years ago

The easy fix is to add sys.stdout.flush() at the end of the main. Looks like the problem is because of the concurrency. I believe something like sys.stdout.flush(); sys.stderr.flush() must be added somewhere in the guts just before sacred dimps std/stderr outputs the last time before finishing.

Qwlouse commented 7 years ago

This is a recurring non-deterministic issue, and honestly, I have no idea why it happens. I'm thoroughly flushing stdout and stderr here. And in general I've tried to be very careful with flushing and closing all involved streams. If you have any ideas what I have overlooked, please let me know.

Qwlouse commented 6 years ago

Several issues with stdout capturing have been resolved recently, so I'd be surprised if the failure rate is still 40%. But there remains a problem somewhere. (see also #289)