Merck / Line-of-Therapy-Algorithm

This is the Line of Therapy Algorithm, as described in the paper "Temporal phenotyping by mining healthcare data to derive lines of therapy for cancer" pending submission in the Journal of Biomedical Informatics.
GNU General Public License v3.0

next drug not part of the regimen #4

Closed by jwu19 2 months ago

jwu19 commented 2 months ago

I have a question about lines 142-158 of `rwToT_LoT_line.py`, quoted below. Why is `line_end_date = max(temp_line_end_df['MED_START'])`? When `next_drug` is not in `r_regimen`, starts before `current_drug_end`, and ends after the latest `MED_END` of the drugs in the current line, `line_end_date` will be the end date of `next_drug`.

```python
        # Check if the next drug is not part of the regimen
        elif (next_drug in r_regimen) == False and (has_eligible_drug_addition == False) and (has_eligible_drug_substition == False):
            temp_check_new_regimen = df[df['MED_START'] == current_drug_end]#['MED_NAME']
            temp_med_name = list(temp_check_new_regimen['MED_NAME'])
            all_temp_med_name_are_in_r_regimen =  all(elem in r_regimen  for elem in temp_med_name)
            line_end_date_less_than_flag = (all_temp_med_name_are_in_r_regimen == False)

            if (line_end_date_less_than_flag):
                temp_line_end_df = df[df['MED_START'] < current_drug_end]
            else:
                temp_line_end_df = df[df['MED_START'] <= current_drug_end]

            line_end_date = max(temp_line_end_df['MED_START'])
            line_end_reason = "New line started with new drugs"
            line_next_start = next_drug_date

            break
```
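
The scenario in the question can be checked on toy data. The following is a minimal sketch, not the repository's code: `line_end_as_quoted` is a hypothetical helper that mirrors only the date selection in the quoted branch, plain lists of dicts stand in for the pandas DataFrame, the drug names and dates are invented, and it assumes that `df` also holds the row for the next drug.

```python
from datetime import date

# Hypothetical toy rows: keys follow the quoted snippet, values are invented.
rows = [
    {"MED_NAME": "drug_a", "MED_START": date(2020, 1, 1), "MED_END": date(2020, 3, 1)},
    {"MED_NAME": "drug_b", "MED_START": date(2020, 1, 1), "MED_END": date(2020, 2, 20)},
    # Next drug: not in the regimen, starts before current_drug_end,
    # and ends after every drug in the current line.
    {"MED_NAME": "drug_c", "MED_START": date(2020, 2, 15), "MED_END": date(2020, 4, 30)},
]
r_regimen = ["drug_a", "drug_b"]      # drugs in the current line (assumed)
current_drug_end = date(2020, 3, 1)   # end of the current regimen drug (assumed)


def line_end_as_quoted(rows, r_regimen, current_drug_end):
    """Mirror the date selection of the quoted branch (hypothetical helper)."""
    # Names of drugs that start exactly on current_drug_end.
    starting_at_end = [r["MED_NAME"] for r in rows if r["MED_START"] == current_drug_end]
    all_in_regimen = all(name in r_regimen for name in starting_at_end)
    if not all_in_regimen:
        # A non-regimen drug starts on that day: exclude that day.
        candidates = [r for r in rows if r["MED_START"] < current_drug_end]
    else:
        candidates = [r for r in rows if r["MED_START"] <= current_drug_end]
    # The quoted code takes the latest MED_START, not the latest MED_END.
    return max(r["MED_START"] for r in candidates)


line_end_date = line_end_as_quoted(rows, r_regimen, current_drug_end)
regimen_max_end = max(r["MED_END"] for r in rows if r["MED_NAME"] in r_regimen)
next_drug_row = rows[2]

print("line_end_date:", line_end_date)
print("next drug start / end:", next_drug_row["MED_START"], "/", next_drug_row["MED_END"])
print("latest MED_END in regimen:", regimen_max_end)
```

On this toy data the selection returns the latest `MED_START` on or before `current_drug_end`, which is the start date of `drug_c`, even though `drug_c` ends after every drug in the regimen. Whether the real `df` includes the next drug's row at this point depends on code outside the quoted lines.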