materialsproject / fireworks

The Fireworks Workflow Management Repo.
https://materialsproject.github.io/fireworks

GetFilesByQueryTask suggestion #332

Closed jotelha closed 5 years ago

jotelha commented 5 years ago

Additionally, allow sorting of the queried files by arbitrary metadata.
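To illustrate what sorting by arbitrary metadata could mean, here is a minimal sketch that orders file records in plain Python by a Mongo-style dotted key. It is not the code from this pull request, which presumably leaves the sorting to MongoDB. The helper names `get_by_dotted_key` and `sort_file_records` and the record layout with `identifier` and `metadata` fields are assumptions made for this example.

```python
from functools import reduce


def get_by_dotted_key(doc, dotted_key):
    """Resolve a Mongo-style dotted key such as 'metadata.step' in a nested dict."""
    return reduce(lambda d, k: d[k], dotted_key.split("."), doc)


def sort_file_records(records, sort_key, descending=False):
    """Return the records ordered by the value found under sort_key."""
    return sorted(
        records,
        key=lambda record: get_by_dotted_key(record, sort_key),
        reverse=descending,
    )


# Hypothetical file records; the field names are assumptions for this sketch.
records = [
    {"identifier": "b", "metadata": {"step": 2}},
    {"identifier": "a", "metadata": {"step": 1}},
    {"identifier": "c", "metadata": {"step": 3}},
]

ordered = sort_file_records(records, "metadata.step")
print([record["identifier"] for record in ordered])
```

With a live database the same ordering would normally be requested from the server, for example through a sort specification passed along with the query, rather than computed on the client.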

coveralls commented 5 years ago

Coverage decreased (-0.2%) to 60.543% when pulling 823a7b716d747104a18c7d87bfbf6df6c3b84279 on jotelha:getFileByQuery into df8374bc3358a826eaa258de333ff6a46d4f54fa on materialsproject:master.

jotelha commented 5 years ago

This pull request is a suggestion to allow querying files by metadata. Originally I had a small nested function in there for converting nested dicts to proper Mongo queries, but Code Climate did not like it. I have been using Fireworks for over a year now during my PhD and would like to gradually contribute a few suggestions via pull requests. However, I have never done this before, so I will have to figure out step by step what these different checks are about and how to satisfy them properly.
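For readers wondering what such a conversion might look like, here is a minimal sketch. It is not the function that was removed from the pull request; the name `to_dotted_query` and the example metadata are assumptions. The idea is that MongoDB compares a nested dict in a query against the whole embedded document, while a dotted key such as `metadata.system.name` matches a single nested field, so a nested dict is flattened into dotted keys. Dicts whose keys start with `$` are treated as query operators and kept as they are.

```python
def to_dotted_query(nested, prefix=""):
    """Flatten a nested dict into a flat dict with Mongo-style dotted keys.

    Dicts containing operator keys (e.g. {"$gt": 3}) and empty dicts are
    kept as values instead of being flattened further.
    """
    flat = {}
    for key, value in nested.items():
        path = "{}.{}".format(prefix, key) if prefix else key
        is_operator_dict = isinstance(value, dict) and any(
            str(k).startswith("$") for k in value
        )
        if isinstance(value, dict) and value and not is_operator_dict:
            # Descend into ordinary sub-documents.
            flat.update(to_dotted_query(value, path))
        else:
            flat[path] = value
    return flat


# Hypothetical metadata query, for illustration only.
query = to_dotted_query(
    {"metadata": {"system": {"name": "water"}, "step": {"$gt": 3}}}
)
print(query)
```

The flattened result can then be passed as a filter to a find call, so that documents with additional metadata fields still match.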

Best regards,

Johannes

jotelha commented 5 years ago

The CircleCI tests fail with these error messages:

py27 run-test-pre: PYTHONHASHSEED='1566239069'
py27 run-test: commands[0] | nosetests
......................../home/circleci/fireworks/.tox/py27/local/lib/python2.7/site-packages/pymongo/topology.py:150: UserWarning: MongoClient opened before fork. Create MongoClient only after forking. See PyMongo's documentation for details: http://api.mongodb.org/python/current/faq.html#is-pymongo-fork-safe
  "MongoClient opened before fork. Create MongoClient only "
./home/circleci/fireworks/.tox/py27/local/lib/python2.7/site-packages/pymongo/topology.py:150: UserWarning: MongoClient opened before fork. Create MongoClient only after forking. See PyMongo's documentation for details: http://api.mongodb.org/python/current/faq.html#is-pymongo-fork-safe
  "MongoClient opened before fork. Create MongoClient only "
./home/circleci/fireworks/.tox/py27/local/lib/python2.7/site-packages/pymongo/topology.py:150: UserWarning: MongoClient opened before fork. Create MongoClient only after forking. See PyMongo's documentation for details: http://api.mongodb.org/python/current/faq.html#is-pymongo-fork-safe
  "MongoClient opened before fork. Create MongoClient only "
..Traceback (most recent call last):
  File "/home/circleci/fireworks/fireworks/core/launchpad.py", line 1690, in recover_offline
    offline_data = loadfn(offline_loc)
  File "/home/circleci/fireworks/.tox/py27/local/lib/python2.7/site-packages/monty/serialization.py", line 72, in loadfn
    with zopen(fn) as fp:
  File "/home/circleci/fireworks/.tox/py27/local/lib/python2.7/site-packages/monty/io.py", line 72, in zopen
    return io.open(filename, *args, **kwargs)
IOError: [Errno 2] No such file or directory: '/home/circleci/fireworks/fireworks/core/tests/launcher_offline/FW_offline.json'
.Traceback (most recent call last):
  File "/home/circleci/fireworks/fireworks/core/rocket.py", line 262, in run
    m_action = t.run_task(my_spec)
  File "/home/circleci/fireworks/fireworks/core/tests/tasks.py", line 22, in run_task
    raise SerializableException(self['exc_details'])
SerializableException
.Traceback (most recent call last):
  File "/home/circleci/fireworks/fireworks/core/rocket.py", line 262, in run
    m_action = t.run_task(my_spec)
  File "/home/circleci/fireworks/fireworks/core/tests/tasks.py", line 22, in run_task
    raise SerializableException(self['exc_details'])
SerializableException
.Traceback (most recent call last):
  File "/home/circleci/fireworks/fireworks/core/rocket.py", line 262, in run
    m_action = t.run_task(my_spec)
  File "/home/circleci/fireworks/fireworks/core/tests/tasks.py", line 22, in run_task
    raise SerializableException(self['exc_details'])
SerializableException
.Traceback (most recent call last):
  File "/home/circleci/fireworks/fireworks/core/rocket.py", line 262, in run
    m_action = t.run_task(my_spec)
  File "/home/circleci/fireworks/fireworks/core/tests/tasks.py", line 22, in run_task
    raise SerializableException(self['exc_details'])
SerializableException
100%|██████████| 100/100 [00:00<00:00, 14872.36it/s]
..../home/circleci/fireworks/.tox/py27/local/lib/python2.7/site-packages/pymongo/topology.py:150: UserWarning: MongoClient opened before fork. Create MongoClient only after forking. See PyMongo's documentation for details: http://api.mongodb.org/python/current/faq.html#is-pymongo-fork-safe
  "MongoClient opened before fork. Create MongoClient only "
./home/circleci/fireworks/.tox/py27/local/lib/python2.7/site-packages/pymongo/topology.py:150: UserWarning: MongoClient opened before fork. Create MongoClient only after forking. See PyMongo's documentation for details: http://api.mongodb.org/python/current/faq.html#is-pymongo-fork-safe
  "MongoClient opened before fork. Create MongoClient only "
Traceback (most recent call last):
  File "/home/circleci/fireworks/fireworks/core/rocket.py", line 262, in run
    m_action = t.run_task(my_spec)
  File "/home/circleci/fireworks/fireworks/core/tests/tasks.py", line 75, in run_task
    raise ValueError('Testing; this error is normal.')
ValueError: Testing; this error is normal.
......../home/circleci/fireworks/.tox/py27/local/lib/python2.7/site-packages/pymongo/topology.py:150: UserWarning: MongoClient opened before fork. Create MongoClient only after forking. See PyMongo's documentation for details: http://api.mongodb.org/python/current/faq.html#is-pymongo-fork-safe
  "MongoClient opened before fork. Create MongoClient only "
/home/circleci/fireworks/.tox/py27/local/lib/python2.7/site-packages/pymongo/topology.py:150: UserWarning: MongoClient opened before fork. Create MongoClient only after forking. See PyMongo's documentation for details: http://api.mongodb.org/python/current/faq.html#is-pymongo-fork-safe
  "MongoClient opened before fork. Create MongoClient only "
.Traceback (most recent call last):
  File "/home/circleci/fireworks/fireworks/core/rocket.py", line 357, in run
    lp.complete_launch(launch_id, m_action, final_state)
  File "/home/circleci/fireworks/fireworks/core/launchpad.py", line 1318, in complete_launch
    m_launch.to_db_dict(), upsert=True)
  File "/home/circleci/fireworks/fireworks/utilities/fw_serializers.py", line 141, in _decorator
    m_dict = func(self, *args, **kwargs)
  File "/home/circleci/fireworks/fireworks/core/firework.py", line 544, in to_db_dict
    m_d = self.to_dict()
  File "/home/circleci/fireworks/fireworks/utilities/fw_serializers.py", line 142, in _decorator
    m_dict = recursive_dict(m_dict)
  File "/home/circleci/fireworks/fireworks/utilities/fw_serializers.py", line 83, in recursive_dict
    for k, v in obj.items()}
  File "/home/circleci/fireworks/fireworks/utilities/fw_serializers.py", line 82, in <dictcomp>
    return {recursive_dict(k, preserve_unicode): recursive_dict(v, preserve_unicode)
  File "/home/circleci/fireworks/fireworks/utilities/fw_serializers.py", line 76, in recursive_dict
    return recursive_dict(obj.as_dict(), preserve_unicode)
  File "/home/circleci/fireworks/fireworks/utilities/fw_serializers.py", line 212, in as_dict
    return self.to_dict()
  File "/home/circleci/fireworks/fireworks/utilities/fw_serializers.py", line 142, in _decorator
    m_dict = recursive_dict(m_dict)
  File "/home/circleci/fireworks/fireworks/utilities/fw_serializers.py", line 83, in recursive_dict
    for k, v in obj.items()}
  File "/home/circleci/fireworks/fireworks/utilities/fw_serializers.py", line 82, in <dictcomp>
    return {recursive_dict(k, preserve_unicode): recursive_dict(v, preserve_unicode)
  File "/home/circleci/fireworks/fireworks/utilities/fw_serializers.py", line 86, in recursive_dict
    return [recursive_dict(v, preserve_unicode) for v in obj]
  File "/home/circleci/fireworks/fireworks/utilities/fw_serializers.py", line 76, in recursive_dict
    return recursive_dict(obj.as_dict(), preserve_unicode)
  File "/home/circleci/fireworks/fireworks/utilities/fw_serializers.py", line 212, in as_dict
    return self.to_dict()
  File "/home/circleci/fireworks/fireworks/core/tests/tasks.py", line 39, in to_dict
    raise RuntimeError("to_dict error")
RuntimeError: to_dict error
.Traceback (most recent call last):
  File "/home/circleci/fireworks/fireworks/core/rocket.py", line 262, in run
    m_action = t.run_task(my_spec)
  File "/home/circleci/fireworks/fireworks/core/tests/tasks.py", line 22, in run_task
    raise SerializableException(self['exc_details'])
SerializableException
..cat: 4: No such file or directory
Traceback (most recent call last):
  File "/home/circleci/fireworks/fireworks/core/rocket.py", line 262, in run
    m_action = t.run_task(my_spec)
  File "/home/circleci/fireworks/fireworks/user_objects/firetasks/script_task.py", line 35, in run_task
    return self._run_task_internal(fw_spec, stdin)
  File "/home/circleci/fireworks/fireworks/user_objects/firetasks/script_task.py", line 88, in _run_task_internal
    raise RuntimeError('ScriptTask fizzled! Return code: {}'.format(returncodes))
RuntimeError: ScriptTask fizzled! Return code: [1]
./home/circleci/fireworks/.tox/py27/local/lib/python2.7/site-packages/pymongo/topology.py:150: UserWarning: MongoClient opened before fork. Create MongoClient only after forking. See PyMongo's documentation for details: http://api.mongodb.org/python/current/faq.html#is-pymongo-fork-safe
  "MongoClient opened before fork. Create MongoClient only "
.........................../home/circleci/fireworks/fireworks/queue/queue_adapter.py:143: UserWarning: Key hello has been specified in qadapter but it is not present in template, please check template (/home/circleci/fireworks/fireworks/user_objects/queue_adapters/PBS_template.txt) for supported keys.
  .format(subs_key, self.template_file))
100%|██████████| 1/1 [00:00<00:00, 134.58it/s]

0it [00:00, ?it/s]
........this is dummy job 1
..........F..This is the first FireWork
.Traceback (most recent call last):
  File "/home/circleci/fireworks/fireworks/core/rocket.py", line 262, in run
    m_action = t.run_task(my_spec)
  File "/home/circleci/fireworks/fireworks/user_objects/firetasks/script_task.py", line 189, in run_task
    output = func(*args, **kwargs)
  File "/home/circleci/fireworks/fireworks/tests/mongo_tests.py", line 47, in throw_error
    raise ValueError(msg)
ValueError: Testing; this error is normal.
...Testing job info
.this is intermediate job 1
this is intermediate job 2
this is intermediate job 3
DONE
......Testing preserve FWorker
.../home/circleci/fireworks/.tox/py27/local/lib/python2.7/site-packages/pymongo/topology.py:150: UserWarning: MongoClient opened before fork. Create MongoClient only after forking. See PyMongo's documentation for details: http://api.mongodb.org/python/current/faq.html#is-pymongo-fork-safe
  "MongoClient opened before fork. Create MongoClient only "
./home/circleci/fireworks/.tox/py27/local/lib/python2.7/site-packages/pymongo/topology.py:150: UserWarning: MongoClient opened before fork. Create MongoClient only after forking. See PyMongo's documentation for details: http://api.mongodb.org/python/current/faq.html#is-pymongo-fork-safe
  "MongoClient opened before fork. Create MongoClient only "
.................
======================================================================
FAIL: test_dupefinder (fireworks.tests.mongo_tests.MongoTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/circleci/fireworks/fireworks/tests/mongo_tests.py", line 505, in test_dupefinder
    self.assertEqual(self.lp.launches.count(), 1)
AssertionError: 2 != 1
-------------------- >> begin captured stdout << ---------------------
TOO MANY LAUNCHES FOUND!

The final summary reads:

ERROR:   py27: commands failed
  py36: commands succeeded
Exited with code 1

I don't see how these errors would be related to the simple GetFilesByQueryTask introduced in this pull request. I would probably need some help to understand this.

computron commented 5 years ago

Thanks for contributing!

The changes look good. The tests were failing due to some one-off errors; rerunning the tests "fixed" them. Merging into FWS v1.9.4.