Open JoelPatchitt opened 3 years ago
I had a look at the phenotype data you organised. It is awesome! Really well done! :tada:
Two minor points:
sessions = {
    "BL": "baseline",
    "F": "oneweek",  # confirmed by Lisa
    "3mf": "threemonth",
    "FY": "oneyear"
}
Again, really well done!
Thank you for the feedback. I will add some comments to help explain the script.
The only issue I am having is with the column names that I have purposely left unquoted on lines 41, 43, and 45 (please see the replies to your comments above).
They are too ambiguous or also appear as part of other column names, so those other columns get picked up when the new dataframe is put together. Is there a way to resolve this ambiguity (possibly using regex)?
First of all, if the meaning is ambiguous, regex won't help. I am not sure why you are so fixated on the idea of using regex.
All you need to do is something like this, for example:
my_list = ["F_stuff", "BL_nostuff", "BL_stuff", "BL_stuffnotthisone"]
collect_stuff = [item for item in my_list if item[-6:] == "_stuff"]  # compare the last 6 characters
Regex doesn't make it faster, since you still need to iterate through the list. For a simple pattern like this, regex is overkill. You won't really encounter cases that need regex unless you are working with more complex data. Here is the same solution in regex; it is not as readable as the original unless you can read regex. Regex is not worth spending your time on at this stage.
import re
my_list = ["F_stuff", "BL_nostuff", "BL_stuff", "BL_stuffnotthisone"]
collect_stuff = [item for item in my_list if re.search(r"_stuff$", item) is not None]
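For comparison, here is a small sketch of why the test has to be anchored to the end of the name. It uses the same example list and `str.endswith`, which should behave like the slice comparison above without needing to count characters. The variable names `loose` and `strict` are only for illustration.

```python
my_list = ["F_stuff", "BL_nostuff", "BL_stuff", "BL_stuffnotthisone"]

# Substring test: too loose, it also keeps "BL_stuffnotthisone"
loose = [item for item in my_list if "_stuff" in item]

# Anchored test: keeps only names that end with "_stuff"
strict = [item for item in my_list if item.endswith("_stuff")]

print(loose)
print(strict)
```

The substring test is probably the kind of check that lets unrelated columns slip in; anchoring the comparison removes them.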
Hey Hao-Ting,
There was an issue with the previous script where session 'F' shared the letter F with columns elsewhere in the data sheet. This caused some baseline data to seep into the follow-up data. I know there is a better way to fix this than the solution I have introduced as 'parse_phenotype' (probably using the re module), but I have tried and tried, and ended up throwing together this ham-fisted solution that did the trick.
Please see the new script, named parse_phenotype.
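One possible way to avoid the shared-letter problem without regex is sketched below. It assumes every session column is named `<code>_<measure>`, which may not hold for the real data sheet. The column names and the helper `columns_for_session` are made up for illustration and are not taken from parse_phenotype.

```python
sessions = {
    "BL": "baseline",
    "F": "oneweek",
    "3mf": "threemonth",
    "FY": "oneyear",
}

# Hypothetical column names, for illustration only
columns = ["BL_score", "F_score", "FY_score", "3mf_score", "BL_F_scale", "Family_size"]


def columns_for_session(columns, code):
    """Return the columns whose text before the first underscore equals `code`."""
    return [col for col in columns if col.split("_", 1)[0] == code]


# Map each session name to the columns that belong to it
by_session = {name: columns_for_session(columns, code) for code, name in sessions.items()}
print(by_session)
```

Comparing the whole prefix, rather than checking whether the letter F appears somewhere in the name, keeps "FY_score", "BL_F_scale" and "Family_size" out of the one-week session.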