Closed hepcat72 closed 3 months ago
Also, it's not clear why this is stacked on top of the all_missing_treatments_exception
branch and not merging directly into main
. It seems tangential, but not a blocking issue.
This does seem to remove some calls to
to_dict()
though it's not clear to me why things were DataFrames in the first place and what functionality was gained/lost.
In the later PR(s), you'll see that I'm using pandas' ability to export a dataframe .to_excel()
. There may very well be a better way to do it, but I'm just going with what I know. At this point, I realized that the conversion to dataframe was simply premature. It's easier to work with the dicts.
One downside is that many of the function names are now rather misleading and the data structure is less well defined.
DataFrame
was easy to check for and understand how to use, but the structure now is less easy to ensure it's correct.
I think they are changed in later PRs. I'll check to make sure that they all get renamed. I think they are, but if there are any that weren't, you're right, it's misleading.
Also, it's not clear why this is stacked on top of the
all_missing_treatments_exception
branch and not merging directly intomain
. It seems tangential, but not a blocking issue.
I think what you mean to say is that it should branch off the branch before all_missing_treatments_exception
, since this code depends on that code. However, I do use the treatments exception in later PRs, so it is dependent. This was my way of killing multiple birds with one stone, and the design effort on the existing exceptions aided my handling of the missing sample exceptions.
This does seem to remove some calls to
to_dict()
though it's not clear to me why things were DataFrames in the first place and what functionality was gained/lost.In the later PR(s), you'll see that I'm using pandas' ability to export a dataframe
.to_excel()
. There may very well be a better way to do it, but I'm just going with what I know. At this point, I realized that the conversion to dataframe was simply premature. It's easier to work with the dicts.
Makes total sense 👍🏻
One downside is that many of the function names are now rather misleading and the data structure is less well defined.
DataFrame
was easy to check for and understand how to use, but the structure now is less easy to ensure it's correct.I think they are changed in later PRs. I'll check to make sure that they all get renamed. I think they are, but if there are any that weren't, you're right, it's misleading.
OK, that's good. It would be nice to make those changes here, however, before merging, instead of relying on a separate PR that isn't related to this minor refactor.
Sorry, I forgot and already merged. Too many concurrent conversations, lol. I suspect they're all changed. 🤞🏻
Ok. phew They are changed... with one caveat: I intentionally left get_dataframe_template()
. My thinking was that they are a template for a dataframe, which explains their structure (using dict keys that are sequential integers). If you don't use integers, they don't play well with excel files that are read into dataframes (and dicts).
Summary Change Description
I realized that a number of methods were converting back and forth between dataframes and dicts and I realized that it would all be much easier if I save the production of actual dataframes until after the samples (etc) have been added.
So I converted all instances of
dfs_dict
from adict of pandas dataframes
to adict of pandas' style dicts
, e.g.{"columnname": {0: "row 1 value", 1: "row 2 value"}}
.This prevents the necessity of various methods converting back and forth between dataframes and dicts. It also addresses issues of places where I was temporarily using lists instead of pandas' version of lists (dicts keyed by sequential integeras starting from 0).
Affected Issues/Pull Requests
Review Notes
See comments in-line.
Checklist
This pull request will be merged once the following requirements are met. The author and/or reviewers should uncheck any unmet requirements:
changelog.md
(or no change)