seznam / flexp

BSD 3-Clause "New" or "Revised" License
8 stars 3 forks source link

Flexp.setup doesn't clean the experiment directory. #10

Closed alaneckhardt closed 6 years ago

alaneckhardt commented 6 years ago

When creating an experiment, I usually did clean_experiment_folder to get rid of previous results. Now if I do this:

    flexp.setup(EXPS_DIR, DESC, with_date=not SHORT_RUN, default_rights=RWXRWXRWX)
    clean_experiment_folder(flexp.get_file_path(""))

it will remove the log.txt as well, because it is now created by flexp.setup and not flog. Before calling flexp.setup, I don't know what is the experiment dir so I can't clean it.

I suggest adding a switch clean_experiment_folder, defaults to True, which removes everything in the experiment folder and only after that creates the log file.

What do you think?

ofilip commented 6 years ago

Makes sense. But I would prefer some shorter name for the param, defaulting to False (+checking that directory does not exist if False)

alaneckhardt commented 6 years ago

Is there any use case when you would like the content in the same directory to be there? Usually, it will be overwritten by the running experiment anyway, at least that's the way we use flexp.

What name do you suggest for the param?

ofilip commented 6 years ago

No, it is a matter of safety (do not override/delete by default). Maybe override? Bit vague but I hope quite clear in the context.

alaneckhardt commented 6 years ago

Override is imo really too vague and with a different meaning.

I'm still not convinced about the default being False. If it is, I would call it always with =True. I can't imagine any situation, when having the content of the previous experiment is desired.

Anybody comments on that?

tomasprinda commented 6 years ago

I guess there could be raise Exception("Experiment name {} already exists") in case of param=False and folder already exists.

How about override_dir?

default=False is OK for me.

shakukse commented 6 years ago

I vote for name override_dir with default False we sometimes don't want to clean directory just choose wrong name for experiment.

alaneckhardt commented 6 years ago

Ok :) Let's do it with the exception and default False.

tomasprinda commented 6 years ago

I added dir name to exception text.

tomasprinda commented 6 years ago

Test is failing. Is it just me? But functionality is OK.

Problem with test is cleaning data after test. shutil.rmtree(expdir)

(py3.7) tomas@tomas-Inspiron-5379:~/src/flexp/tests$ pytest test_flexp_override.py 
============================================================= test session starts =============================================================
platform linux -- Python 3.7.0, pytest-3.8.2, py-1.6.0, pluggy-0.7.1
rootdir: /media/tomas/data_storage/src/flexp, inifile:
collected 1 item                                                                                                                              

test_flexp_override.py F                                                                                                                [100%]

================================================================== FAILURES ===================================================================
________________________________________________________________ test_override ________________________________________________________________

    def test_override():
        # We have to reset the _eh to make flexp stop complaining about calling setup twice.
        flexp.core._eh = {}
        flexp.setup("tests/data/", "exp01", False, override_dir=False)

        expdir = path.join("tests/data/", "exp01")
        assert path.isdir(expdir), "flexp didn't create experiment dir with override_dir=False"

        # Test that it fails to create the directory, there should be logging file already.
        with pytest.raises(FileExistsError):
            flexp.core._eh = {}
            flexp.setup("tests/data/", "exp01", False, override_dir=False)

        # This should be ok
        flexp.core._eh = {}
        flexp.setup("tests/data/", "exp01", False, override_dir=True)

        # Remove the experiment dir
>       shutil.rmtree(expdir)

test_flexp_override.py:32: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/home/tomas/.pyenv/versions/3.7.0/lib/python3.7/shutil.py:489: in rmtree
    onerror(os.rmdir, path, sys.exc_info())
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

path = 'tests/data/exp01', ignore_errors = False, onerror = <function rmtree.<locals>.onerror at 0x7f7718acde18>

    def rmtree(path, ignore_errors=False, onerror=None):
        """Recursively delete a directory tree.

        If ignore_errors is set, errors are ignored; otherwise, if onerror
        is set, it is called to handle the error with arguments (func,
        path, exc_info) where func is platform and implementation dependent;
        path is the argument to that function that caused it to fail; and
        exc_info is a tuple returned by sys.exc_info().  If ignore_errors
        is false and onerror is None, an exception is raised.

        """
        if ignore_errors:
            def onerror(*args):
                pass
        elif onerror is None:
            def onerror(*args):
                raise
        if _use_fd_functions:
            # While the unsafe rmtree works fine on bytes, the fd based does not.
            if isinstance(path, bytes):
                path = os.fsdecode(path)
            # Note: To guard against symlink races, we use the standard
            # lstat()/open()/fstat() trick.
            try:
                orig_st = os.lstat(path)
            except Exception:
                onerror(os.lstat, path, sys.exc_info())
                return
            try:
                fd = os.open(path, os.O_RDONLY)
            except Exception:
                onerror(os.lstat, path, sys.exc_info())
                return
            try:
                if os.path.samestat(orig_st, os.fstat(fd)):
                    _rmtree_safe_fd(fd, path, onerror)
                    try:
>                       os.rmdir(path)
E                       OSError: [Errno 39] Directory not empty: 'tests/data/exp01'

/home/tomas/.pyenv/versions/3.7.0/lib/python3.7/shutil.py:487: OSError
------------------------------------------------------------ Captured stderr call -------------------------------------------------------------
2018-10-05 20:06:22,761 - flexp.flexp.core - INFO - following files have been saved to experiment folder: /home/tomas/.pyenv/versions/py3.7/bin/pytest
-------------------------------------------------------------- Captured log call --------------------------------------------------------------
core.py                     82 WARNING  could not add group+write to experiment root dir tests/data/
core.py                    192 INFO     following files have been saved to experiment folder: /home/tomas/.pyenv/versions/py3.7/bin/pytest
========================================================== 1 failed in 0.08 seconds ===========================================================
alaneckhardt commented 6 years ago

I just tried the tests on a clean pyenv but it works for me. Python 3.6.6. This is the output:

$ pytest tests/
================================================================================= test session starts ==================================================================================
platform linux -- Python 3.6.6, pytest-3.8.2, py-1.6.0, pluggy-0.7.1
rootdir: /home/alan/research/flexp, inifile:
collected 20 items                                                                                                                                                                     

tests/test_cache.py ...                                                                                                                                                          [ 15%]
tests/test_cache_attrs.py .                                                                                                                                                      [ 20%]
tests/test_chain.py .........                                                                                                                                                    [ 65%]
tests/test_flexp.py ..                                                                                                                                                           [ 75%]
tests/test_flexp_override.py .                                                                                                                                                   [ 80%]
tests/test_inspect.py ..                                                                                                                                                         [ 90%]
tests/test_parallel.py .                                                                                                                                                         [ 95%]
tests/test_utils.py .                                                                                                                                                            [100%]

=================================================================================== warnings summary ===================================================================================
/home/alan/research/flexp/tests/test_cache_attrs.py:10: PytestWarning: cannot collect test class 'TestModule' because it has a __init__ constructor
  class TestModule:

-- Docs: https://docs.pytest.org/en/latest/warnings.html
======================================================================== 20 passed, 1 warnings in 0.33 seconds =========================================================================
alaneckhardt commented 6 years ago

Aha! In Python 3.7, the previous test test_working fails:

$ pytest tests/
================================================================================= test session starts ==================================================================================
platform linux -- Python 3.7.0, pytest-3.8.2, py-1.6.0, pluggy-0.7.1
rootdir: /home/alan/tunnels/goedel/research/flexp, inifile:
collected 20 items                                                                                                                                                                     

tests/test_cache.py ...                                                                                                                                                          [ 15%]
tests/test_cache_attrs.py .                                                                                                                                                      [ 20%]
tests/test_chain.py .........                                                                                                                                                    [ 65%]
tests/test_flexp.py F.                                                                                                                                                           [ 75%]
tests/test_flexp_override.py F                                                                                                                                                   [ 80%]
tests/test_inspect.py ..                                                                                                                                                         [ 90%]
tests/test_parallel.py .                                                                                                                                                         [ 95%]
tests/test_utils.py .                                                                                                                                                            [100%]

======================================================================================= FAILURES =======================================================================================
_____________________________________________________________________________________ test_working _____________________________________________________________________________________

    def test_working():
        flexp.setup("tests/data/", "exp01", False)

        expdir = path.join("tests/data/", "exp01")
        assert path.isdir(expdir), "flexp didn't create experiment dir"

        with io.open("tests/testfile.txt", "wt") as x:
            x.write(u"hello")

        flexp.backup_files(["tests/testfile.txt"])
        assert path.exists(path.join("tests/data/", "exp01", "testfile.txt"))

        # forbid flexp creating metadata on exit
        if hasattr(atexit, "unregister"):
            getattr(atexit, "unregister")(flexp.core._eh["experiment"]._write_metadata)

        os.unlink("tests/testfile.txt")
        os.unlink(path.join(expdir, "testfile.txt"))
        if not hasattr(atexit, "unregister") and path.exists(
                path.join(expdir, "flexp_info.txt")):
            os.unlink(path.join(expdir, "flexp_info.txt"))
>       shutil.rmtree(expdir)

tests/test_flexp.py:33: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/home/alan/miniconda3/envs/test_flexp/lib/python3.7/shutil.py:489: in rmtree
    onerror(os.rmdir, path, sys.exc_info())
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

path = 'tests/data/exp01', ignore_errors = False, onerror = <function rmtree.<locals>.onerror at 0x7f301f2d1f28>

    def rmtree(path, ignore_errors=False, onerror=None):
        """Recursively delete a directory tree.

        If ignore_errors is set, errors are ignored; otherwise, if onerror
        is set, it is called to handle the error with arguments (func,
        path, exc_info) where func is platform and implementation dependent;
        path is the argument to that function that caused it to fail; and
        exc_info is a tuple returned by sys.exc_info().  If ignore_errors
        is false and onerror is None, an exception is raised.

        """
        if ignore_errors:
            def onerror(*args):
                pass
        elif onerror is None:
            def onerror(*args):
                raise
        if _use_fd_functions:
            # While the unsafe rmtree works fine on bytes, the fd based does not.
            if isinstance(path, bytes):
                path = os.fsdecode(path)
            # Note: To guard against symlink races, we use the standard
            # lstat()/open()/fstat() trick.
            try:
                orig_st = os.lstat(path)
            except Exception:
                onerror(os.lstat, path, sys.exc_info())
                return
            try:
                fd = os.open(path, os.O_RDONLY)
            except Exception:
                onerror(os.lstat, path, sys.exc_info())
                return
            try:
                if os.path.samestat(orig_st, os.fstat(fd)):
                    _rmtree_safe_fd(fd, path, onerror)
                    try:
>                       os.rmdir(path)
E                       OSError: [Errno 39] Directory not empty: 'tests/data/exp01'

/home/alan/miniconda3/envs/test_flexp/lib/python3.7/shutil.py:487: OSError
--------------------------------------------------------------------------------- Captured stderr call ---------------------------------------------------------------------------------
2018-10-09 09:04:12,284 - flexp.flexp.core - INFO - following files have been saved to experiment folder: tests/testfile.txt
---------------------------------------------------------------------------------- Captured log call -----------------------------------------------------------------------------------
core.py                    192 INFO     following files have been saved to experiment folder: tests/testfile.txt
____________________________________________________________________________________ test_override _____________________________________________________________________________________

    def test_override():
        # We have to reset the _eh to make flexp stop complaining about calling setup twice.
        flexp.core._eh = {}
>       flexp.setup("tests/data/", "exp01", False, override_dir=False)

tests/test_flexp_override.py:17: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
flexp/flexp/core.py:273: in setup
    clean_experiment_dir(override_dir)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

override_dir = False

    def clean_experiment_dir(override_dir):
        """
        Cleans experiment folder.
        :param bool override_dir: If true, clean the folder, if false, check it's empty or doesn't exists
        :return:
        """
        experiment_dir = get_file_path("")
        # Check experiment_dir is empty or doesn't exists
        if not override_dir:
            if not os.path.exists(experiment_dir) or not os.listdir(experiment_dir):
                return
            else:
>               raise FileExistsError("Directory {} exists and is not empty! Use override_dir=True to clean the directory.")
E               FileExistsError: Directory {} exists and is not empty! Use override_dir=True to clean the directory.

flexp/flexp/core.py:238: FileExistsError
=================================================================================== warnings summary ===================================================================================
alaneckhardt commented 6 years ago

It looks like the logging creates files with names like this: tests/data/exp01/.fuse_hidden0000226d00001862 with content 2018-10-09 09:11:51,649 - flexp.flexp.core - INFO - following files have been saved to experiment folder: tests/testfile.txt If the file is deleted, it gets created again with different filename.

Is there any way to stop the experiment and stop the logging?

alaneckhardt commented 6 years ago

I modified disable() function to stop all logging. @tomasprinda , could you please check if the test is still failing for you? It didn't fail for me, so I can't test it.

tomasprinda commented 6 years ago

Still failing but I know the problem.

When flexp.setup() is called log file_handler is created here which opens the file log.txt and never closes the file.

In test_flexp_override.py, flexp.setup() is called twice, so two file handlers are open. The reason why shutil.rmtree(expdir) doesn't let me delete folder are the open files.

tomasprinda commented 6 years ago

Fixed by adding method for closing handlers and calling in _setup_logging(). Fixed also in test_flexp.py

Test is passing for me now.

Please check if works for you and solution is OK.

alaneckhardt commented 6 years ago

Works, tests are passing for me as well. Sorry for the delay :(