Open A-Fisk opened 5 years ago
Yeah, totally agree.
Unfortunately, I don't have much time this week as I am presenting at our lab meeting on Wednesday and I am off for my holiday Thursday morning for two weeks. Hence: do you want to make a pull request? Otherwise, I will get to it when I am back Aug 2.
I have started writing some skeleton code for a decorator so the body of the relevant functions can ideally remain unchanged. If you look at the latest commit, I have added to data I/O functions in example_pipeline/data_io.py
the following decorator:
def _handle_file_path(func):
def func_wrapper(file_path, *args, **kwargs):
pathlib_object = _get_pathlib_object(file_path)
output = func(pathlib_object, *args, **kwargs)
return output
func_wrapper.__name__ = func.__name__
func_wrapper.__doc__ = func.__doc__
return func_wrapper
def _get_pathlib_object(file_path):
return file_path
_get_pathlib_object
is obviously a dud (it just returns the string without doing anything to it).
I would propose that _get_pathlib_object
I can have a look at putting something together later this week. Hopefully it won't be beyond me, I'll follow your steps and see how that goes.
This doesn't have any tests yet to see if it's working does it?
I have a couple of test scripts locally, but they are still in a mess and in no form to be published, and still need a lot of work to be made into a proper test suite. Most I/O functions are still missing, mostly because I have only been getting files from your lab and didn't really know what exceptions to test against.
In the meantime, example_pipeline/02_test_state_annotation.py
is an ok poor man's integration test as the coverage is actually pretty good (apart from the other scripts in the example_pipeline
). If 01_preprocess_signals.py
passes and 02_test_state_annotation.py
still gives the same accuracy values, everything is probably ok.
On a different note, scratch point 2 in my objectives for _get_pathlib_object
.
It will be counter-productive for functions that write data (as the path probably shouldn't exist) and for functions that read data, it would be better if those functions threw the error and not the generic wrapper.
So all that needs to go into the _get_pathlib_object
is:
The whole function should probably be renamed to _sanitize_file_path
, coming to think of it.
Re: testing. I've come across a few problems running my scripts because I've processed them slightly differently to other people. Not sure where the best place to record them is, open another issue for them?
Re: objectives. checking if path exists. How about checking if the path exists, if it doesn't at least check the parent exists so it can still write there?
Not sure where the best place to record them is, open another issue for them?
Yes, please.
How about checking if the path exists, if it doesn't at least check the parent exists so it can still write there?
Great idea. I should have thought of that.
The file paths read in from the spreadsheets are kept in strings and we keep running into problems of getting backslashes/forwardslashes as we keep changing from unix to windows.
Worth updating to read them in as pathlib objects?