TheCedarPrince closed this issue 10 months ago
Here is what we discussed in our call:
# Existing dispatch that works with a vector of person ids
GetPatientGender([1, 2, 3], conn)
# Issue idea
using DataFrames
df = DataFrame(person_id = [1, 2, 3])
GetPatientGender(df, conn)
# Dispatch function "knows" what column it is expecting to see from the DataFrame
function GetPatientGender(df::DataFrame, conn; ...)
    # Pull the ids out of the column this dispatch expects
    ids = df.person_id
    # New DataFrame returned from the existing dispatch call,
    # with two columns: person_id, gender_concept_id
    new_df = GetPatientGender(ids, conn)
    # Try this join step out on one or two functions first
    df = outerjoin(df, new_df, on = :person_id)
    # DataFrame with two columns: person_id, gender_concept_id
    # This is the DataFrame that was passed in, extended with the new column.
    # Note: outerjoin returns a new DataFrame, so the caller's `df` is not
    # mutated in place; use the returned value
    return df
end
Let me know if you have any questions -- thanks!
Closed by #54
This is a feature I have been wanting and thinking about for a while. I think all kinds of functions within this package should be able to accept a DataFrame and, depending on the function, know how to index that DataFrame to automatically retrieve the information they require. Additionally, functions should perhaps automatically join their results onto the DataFrame that was passed in.
The reason for these changes is that I often want to use the pattern:
or even
These patterns would let me do very quick analyses and re-use them over and over again clearly and explicitly. I am not sure how much of the API should change as a result, but the change would lend itself much better to function composition.