This is the Line of Therapy Algorithm, as described in the paper "Temporal phenotyping by mining healthcare data to derive lines of therapy for cancer" pending submission in the Journal of Biomedical Informatics.
I have a question about lines 142-158 of rwToT_LoT_line.py, shown below. Why is `line_end_date` set to `max(temp_line_end_df['MED_START'])`? Consider the case where `next_drug` is not in `r_regimen`, starts before `current_drug_end`, and ends after the latest med_end of the drugs in the current line: then the `line_end_date` will be the end date of the `next_drug`.
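To make the case concrete, here is a minimal sketch that runs the same selection logic on a toy table. The drug names, the dates, and the `MED_END` column are assumptions made up for illustration and are not taken from the repository; only the filtering and the `max(...)` expression mirror the excerpt further down.

```python
import pandas as pd

# Hypothetical toy data: drugs A and B form the current line, drug C overlaps it.
df = pd.DataFrame({
    "MED_NAME": ["A", "B", "C"],
    "MED_START": pd.to_datetime(["2020-01-01", "2020-01-01", "2020-02-15"]),
    "MED_END": pd.to_datetime(["2020-03-01", "2020-03-01", "2020-06-01"]),
})

r_regimen = ["A", "B"]                          # assumed current regimen
current_drug_end = pd.Timestamp("2020-03-01")   # assumed end of the current drug
next_drug = "C"                                 # not in r_regimen, starts before current_drug_end

# Same selection as the quoted branch, isolated from the surrounding loop.
started_at_end = df[df["MED_START"] == current_drug_end]
all_in_regimen = all(name in r_regimen for name in started_at_end["MED_NAME"])
if not all_in_regimen:
    temp_line_end_df = df[df["MED_START"] < current_drug_end]
else:
    temp_line_end_df = df[df["MED_START"] <= current_drug_end]

# The expression in question: the latest start date among the selected rows.
line_end_date = max(temp_line_end_df["MED_START"])

# For comparison: the latest end date among the drugs of the current line.
latest_med_end_in_line = df.loc[df["MED_NAME"].isin(r_regimen), "MED_END"].max()

print("line_end_date:", line_end_date)
print("latest med_end in current line:", latest_med_end_in_line)
```

In this toy data no drug starts exactly on `current_drug_end`, so the `<=` filter is used and the row for drug C is included in `temp_line_end_df`; the maximum `MED_START` is then the start date of drug C.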
```
# Check if the next drug is not part of the regimen
elif (next_drug in r_regimen) == False and (has_eligible_drug_addition == False) and (has_eligible_drug_substition == False):
    temp_check_new_regimen = df[df['MED_START'] == current_drug_end]#['MED_NAME']
    temp_med_name = list(temp_check_new_regimen['MED_NAME'])
    all_temp_med_name_are_in_r_regimen = all(elem in r_regimen for elem in temp_med_name)
    line_end_date_less_than_flag = (all_temp_med_name_are_in_r_regimen == False)
    if (line_end_date_less_than_flag):
        temp_line_end_df = df[df['MED_START'] < current_drug_end]
    else:
        temp_line_end_df = df[df['MED_START'] <= current_drug_end]
    line_end_date = max(temp_line_end_df['MED_START'])
    line_end_reason = "New line started with new drugs"
    line_next_start = next_drug_date
    break
```