Open TimothyWillard opened 3 months ago
I'll write some documentation! Thanks for looping me in.
List of functions/classes to document/test:
List of functions/classes to document:
write_df
read_df
command_safe_run
add_method
search_and_import_plugins_class
profile
as_list
Timer
ISO8601Date
as_date
as_evaled_expression
get_truncated_normal
get_log_normal
as_random_distribution
list_filenames
rolling_mean_pad
print_disk_diagnosis
create_resume_out_filename
create_resume_input_filename
get_filetype_for_resume
create_resume_file_names_map
download_file_from_s3
move_file_at_local
I'll start working on the top half!
Sorry, I lied... I think I'll just go ahead and do all of them, @TimothyWillard. I don't have much else to do this afternoon.
A couple questions for @jcblemai before I push changes.
Tagging @fang19911030 because he wrote these functions :)
These are string arguments :)
Hi @emprzy, as @jcblemai said, these are string arguments. Are you asking about the meaning of these two arguments in the context of model execution?
In the functions 'create_resume_out_filename' and 'create_resume_input_filename', what is the arg 'liketype' for?
liketype takes one of two values, chimeric or global, depending on which "chain" of the dual-chain MCMC scheme we are using (the one for all subpopulations vs. the one that treats them as independent).
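For documentation purposes, the two accepted values could be captured in a type hint and a small check. This is only a sketch: the names LikeType and validate_liketype are hypothetical and not part of gempyor.utils. The only thing taken from this thread is that liketype is a string that is either chimeric or global.

```python
from typing import Literal, get_args

# Hypothetical alias; the thread only states that liketype is "chimeric" or "global".
LikeType = Literal["chimeric", "global"]


def validate_liketype(liketype: str) -> str:
    """Check that a liketype string is one of the two accepted values.

    Args:
        liketype: Which chain of the dual-chain MCMC scheme a file belongs to,
            either "chimeric" or "global".

    Returns:
        The input string, unchanged, if it is valid.

    Raises:
        ValueError: If `liketype` is not one of the accepted values.
    """
    allowed = get_args(LikeType)
    if liketype not in allowed:
        raise ValueError(f"liketype must be one of {allowed}, got {liketype!r}")
    return liketype
```

Spelling the allowed values out in the annotation means the docstring and the check cannot drift apart, which may be easier to maintain than describing them in prose alone.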
In the function 'create_resume_file_names_map', what are the args 'resume_run_index' and 'flepi_prefix' for?
These are parts of the filenames; they are used to build the names of the files to download when resuming.
@fang19911030 Yes, I mean in the context of model execution. I'm just not sure what 'liketype' is referencing and want to be able to document it properly.
Sorry, I had a formatting issue in my last message. Let me know if you need more info about liketype.
@emprzy Do you have plans to follow up GH-260 with a PR for adding/restructuring the unit tests for the functions documented?
I didn't have that on my to do list, but if you want me to do that I could? Why do you ask?
No, if you have other items on your todo list that's fine, I'm just following up to understand what is left to do here.
List of functions/classes to be tested/have tests restructured:
write_df
read_df
command_safe_run
add_method
search_and_import_plugins_class
profile
as_list
Timer
ISO8601Date
as_date
as_evaled_expression
get_truncated_normal
get_log_normal
as_random_distribution
list_filenames
rolling_mean_pad
print_disk_diagnosis
create_resume_out_filename
create_resume_input_filename
get_filetype_for_resume
create_resume_file_names_map
download_file_from_s3
move_file_at_local
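As a rough idea of what a consistently styled unit test for one of these could look like, here is a minimal sketch in plain pytest style (test_* functions with bare asserts). The helper as_list_sketch is a hypothetical stand-in defined only so the example is self-contained. It is an assumption about what a list-wrapping helper does, not the actual gempyor.utils.as_list implementation.

```python
def as_list_sketch(thing):
    """Hypothetical stand-in: return lists as-is and wrap anything else in a list."""
    return thing if isinstance(thing, list) else [thing]


def test_list_input_is_returned_unchanged():
    original = [1, 2, 3]
    # The very same list object comes back, not a copy.
    assert as_list_sketch(original) is original


def test_scalar_input_is_wrapped():
    assert as_list_sketch("a") == ["a"]
    assert as_list_sketch(5) == [5]


def test_tuple_input_is_wrapped_not_converted():
    # A tuple is not a list, so it is wrapped as a single element.
    assert as_list_sketch((1, 2)) == [(1, 2)]
```

One test function per behavior, named after what it checks, keeps failures easy to read when the suite grows to cover every function in the list.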
The gempyor.utils module could use some work, namely: documenting the contained objects with a consistent style, creating a comprehensive test suite that ensures expected behavior, and deprecating some functions/classes or moving objects that shouldn't be housed in utils to a more appropriate module.

Documentation: Some functions are documented (e.g. rolling_mean_pad or download_file_from_s3), but most are not, nor is the module itself. The documentation also needs to be made consistent. The existing documentation looks like a variant of the Google style guide, so I propose sticking to that. Part of the documentation work is also expanding the use of type hints, which tell users what inputs are expected.

Tests: There are some existing tests in tests/utils/test_file_paths.py, tests/utils/test_utils.py, and tests/utils/test_utils2.py. It would be nice to create a consistent style across these tests and expand them to ensure that expected behavior and usage are covered. I would also suggest placing unit tests in a more structured format, namely:

Deprecate/Migrate: This is a longer-term goal, but these functions should be moved to a more descriptive home. The name utils gives me very little information about the module's contents, whereas names like io, math, etc. are informative. Below are some brief thoughts on each function:

write_df / read_df / download_file_from_s3 / move_file_at_local: Should go to io (new) or similar.
command_safe_run: Could go to exec (new) or similar, but is also probably fine staying in utils.
add_method: I don't understand this function and its use quite yet, so no suggestion.
search_and_import_plugins_class: Could go to file_paths maybe (not quite the right place either); staying in utils is fine for now.
profile / Timer / print_disk_diagnosis: Should go into profile (new) or similar.
as_list: Should be removed; this is a one-liner that is better left to developers to implement on their own per their use case.
ISO8601Date: Could go into date or similar if there are other date util functions, or could go into a confuse module.
as_date / as_evaled_expression / as_random_distribution: Similar to add_method above, not sure why there isn't a class extending confuse.ConfigView instead of directly modifying that object (which is how I'm reading this).
get_truncated_normal / get_log_normal: Should go into statistics.
list_filenames / create_resume_out_filename / create_resume_input_filename / get_filetype_for_resume / create_resume_file_names_map: Should go into file_paths.
rolling_mean_pad: Should go into time_series or similar.

This is a bit of a lengthy issue, so I don't expect to complete it overnight, and I think putting some code/documentation/tests into an initial PR to highlight the value would be good (will do soon). But any initial thoughts, @jcblemai or @emprzy?
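To make the documentation proposal concrete, here is a minimal sketch of a Google-style docstring combined with type hints. The function centered_rolling_mean is hypothetical and written only for illustration. It is not rolling_mean_pad and makes no claim about how that function pads or averages.

```python
from collections.abc import Sequence


def centered_rolling_mean(values: Sequence[float], window: int) -> list[float]:
    """Compute a centered rolling mean, shrinking the window at the edges.

    Args:
        values: The series to smooth.
        window: Width of the averaging window. Must be a positive odd integer
            so that the window can be centered on each element.

    Returns:
        A list with the same length as `values`, where each entry is the mean
        of the elements that fall inside the window centered on that position.

    Raises:
        ValueError: If `window` is not a positive odd integer.

    Examples:
        >>> centered_rolling_mean([1.0, 2.0, 3.0, 4.0], 3)
        [1.5, 2.0, 3.0, 3.5]
    """
    if window < 1 or window % 2 == 0:
        raise ValueError(f"window must be a positive odd integer, got {window}")
    half = window // 2
    means = []
    for i in range(len(values)):
        # Clip the window to the bounds of the series instead of padding.
        start = max(0, i - half)
        stop = min(len(values), i + half + 1)
        chunk = values[start:stop]
        means.append(sum(chunk) / len(chunk))
    return means
```

The Args / Returns / Raises / Examples sections follow the Google style, and the type hints in the signature carry the expected input and output types so the docstring does not need to repeat them.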