materialsproject / fireworks

The Fireworks Workflow Management Repo.
https://materialsproject.github.io/fireworks
Other
351 stars 184 forks source link

Better CLI errors on missing config files #480

Closed janosh closed 2 years ago

janosh commented 2 years ago

At the moment, when not setting FW_config.yaml or giving a non-existant path to a config file like -l path/to/nowhere/my_launchpad.yaml, the CLIs print a rather unhelpful error message:

Traceback (most recent call last):
  File "/rds/user/jr769/hpc-work/.venv/py39/bin/qlaunch", line 8, in <module>
    sys.exit(qlaunch())
  File "/rds/user/jr769/hpc-work/.venv/py39/lib/python3.9/site-packages/fireworks/scripts/qlaunch_run.py", line 326, in qlaunch
    do_launch(args)
  File "/rds/user/jr769/hpc-work/.venv/py39/lib/python3.9/site-packages/fireworks/scripts/qlaunch_run.py", line 76, in do_launch
    queueadapter = load_object_from_file(args.queueadapter_file)
  File "/rds/user/jr769/hpc-work/.venv/py39/lib/python3.9/site-packages/fireworks/utilities/fw_serializers.py", line 385, in load_object_from_file
    f_format = filename.split(".")[-1]
AttributeError: 'NoneType' object has no attribute 'split'

This PR changes the behavior to

FileNotFoundError: launchpad_file 'path/to/nowhere/my_launchpad.yaml' does not exist!

I also added tests for all 4 CLIs to ensure this behavior doesn't break in the future.

janosh commented 2 years ago

Still needs tests for the new lpad config file validation and manual testing in a real world setting.

janosh commented 2 years ago

Alright, we're getting there. I think these changes will make the CLIs more user friendly when not passing in required config files or providing non-existing file paths or trying to connect to a database on a host:port where none is running.

@arosen93 I've done some local testing and everything seems to work. But after the issues in #476 and considering I made quite a few changes, I'd like to be more thorough with testing this time. My own usage of FW might be quite limited compared to yours. If you have a few minutes, would you be able to install these changes and see if you encounter any problems running your usual commands?

pip uninstall fireworks -y
pip install https://github.com/janosh/fireworks/zipball/better-cli-errors-on-missing-config-files
Andrew-S-Rosen commented 2 years ago

@janosh, sure I'd be happy to. Will aim to do so at the end of the day, CA time. From our prior conversation, it also sounds like if FW_CONFIG_DIR is wrong, the improved error message should be present? That's a scenario where I can see it being particularly helpful, so I'll confirm.

janosh commented 2 years ago

Thanks!

From our prior conversation, it also sounds like if FW_CONFIG_DIR is wrong, the improved error message should be present?

Yes.

Andrew-S-Rosen commented 2 years ago

Things seem to work okay for me based on a few common test cases. I will admit, in practice I have never used -l, although I do swap config directories all the time with -c. That's probably why I never (accidentally) ran into the "unhelpful" CLI error.

janosh commented 2 years ago

@arosen93 Thanks for checking! I prefer to use the -c flag as well. But then I'm curious, if you've ever misspecified the path to your config directory, you should have gotten this other not very helpful error message which annoyingly only appears after 30 seconds of waiting, right?

pymongo.errors.ServerSelectionTimeoutError: localhost:27017: [Errno 61] Connection refused, Timeout: 30s, Topology Description: <TopologyDescription id: 62022707e3e544f54d361c5d, topology_type: Unknown, servers: [<ServerDescription ('localhost', 27017) server_type: Unknown, rtt: None, error=AutoReconnect('localhost:27017: [Errno 61] Connection refused')>]>

With this change the error appears after half a second and says

ValueError: FireWorks was not able to connect to MongoDB at localhost:27017. Is the server running? The database file specified was None. Type "lpad init" if you would like to set up a file that specifies location and credentials of your Mongo database (otherwise use default localhost configuration).

I cut the wait time to half a second since Fireworks, if it can't find a launchpad file in the -c, defaults to creating a Launchpad with default parameters, i.e. one that tries to connect to a local Mongo DB in which case the connection should either be very quick to establish or not possible at all if none is running.

Incidentally, this discussion caused me to find a small error in the code:

- os.path.exists(os.path.join(args.config_dir, f"my_{file_name}.yaml"))
+ os.path.exists(os.path.join(args.config_dir, f"my_{config_file}.yaml"))

@computron If the change in fireworks/fw_config.py I highlighted above is acceptable, I think this is ready for review.