adbailey1 / daic_woz


Sincere learner #1

Open pengsiyu2019102942 opened 4 years ago

pengsiyu2019102942 commented 4 years ago

Hello, I recently read the code you wrote and I have a question. The dataset I downloaded is a bit different from yours: I am missing the full_train_split_Depression_AVEC2017.csv file and the complete_Depression_AVEC2017.csv file. Would you be willing to share yours, for example via a network-drive link to the dataset? Thank you, and I wish you all the best!

adbailey1 commented 4 years ago

Hi! Thanks for checking out the repo and for finding this bug. I actually fixed this issue in one of my other repos LINK. Before I get around to updating this repo and closing the issue, the following code should be enough:

if not os.path.exists(config.COMP_DATASET_PATH):
    if not os.path.exists(config.FULL_TRAIN_SPLIT_PATH):
        utilities.merge_csv(config.TRAIN_SPLIT_PATH, config.DEV_SPLIT_PATH, config.FULL_TRAIN_SPLIT_PATH)
    utilities.merge_csv(config.FULL_TRAIN_SPLIT_PATH, config.TEST_SPLIT_PATH, config.COMP_DATASET_PATH)

where config.FULL_TRAIN_SPLIT_PATH and config.COMP_DATASET_PATH are:

FULL_TRAIN_SPLIT_PATH = os.path.join(DATASET, 'full_train_split_Depression_AVEC2017.csv')
COMP_DATASET_PATH = os.path.join(DATASET, 'complete_Depression_AVEC2017.csv')
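The other split paths referenced in the snippet (config.TRAIN_SPLIT_PATH, config.DEV_SPLIT_PATH, config.TEST_SPLIT_PATH) are defined along the same lines; a minimal sketch, assuming the standard AVEC 2017 file names from the DAIC-WOZ download and a DATASET root of your choosing:

import os

DATASET = '/path/to/daic_woz'  # assumption: root folder of the downloaded dataset
TRAIN_SPLIT_PATH = os.path.join(DATASET, 'train_split_Depression_AVEC2017.csv')
DEV_SPLIT_PATH = os.path.join(DATASET, 'dev_split_Depression_AVEC2017.csv')
TEST_SPLIT_PATH = os.path.join(DATASET, 'test_split_Depression_AVEC2017.csv')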

and where utilities.merge_csv() is:


import pandas as pd


def merge_csv(path, path2, filename):
    """Merge the CSV files at path and path2 and save the result to filename."""

    a = pd.read_csv(path)
    b = pd.read_csv(path2)

    columnsa = list(a)
    columnsb = list(b)
    # The test split doesn't have the same columns as the train/dev
    # Create these extra columns and fill them with -1
    if len(columnsa) > len(columnsb):
        difference = len(columnsa) - len(columnsb)
        names = columnsa[-difference:]
        h, _ = b.shape
        zeros = [-1] * h
        for i in range(difference):
            b[names[i]] = zeros

    columnsb = list(b)
    # This checks that the column headers are the same in the two CSV files
    for i in range(len(columnsa)):
        # If headers are different, re-name
        if columnsa[i] != columnsb[i]:
            b = b.rename(columns={columnsb[i]: columnsa[i]})

    # Create a single dataframe from the 2 CSV files and save
    dataframes = [a, b]
    c = pd.concat(dataframes)
    c = c.sort_values(by=['Participant_ID'])

    c.to_csv(filename, index=False)
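To sanity-check merge_csv, here is a small toy example (the file names and data are invented for demonstration, not part of the repo; only Participant_ID and PHQ8_Score mirror the real split CSVs), assuming the function above is in scope:

import pandas as pd

# Train/dev-style file with a label column
pd.DataFrame({'Participant_ID': [300, 302], 'PHQ8_Score': [5, 12]}).to_csv('a.csv', index=False)
# Test-style file with fewer columns, like the AVEC 2017 test split
pd.DataFrame({'Participant_ID': [301]}).to_csv('b.csv', index=False)

merge_csv('a.csv', 'b.csv', 'merged.csv')
print(pd.read_csv('merged.csv'))
# Participant 301 gets PHQ8_Score filled with -1 and rows come out sorted by Participant_ID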

If this doesn't work or there are any other problems, let me know. I'll hopefully get around to updating this during the week. A cleaner and simpler repo than this one is [HERE](https://github.com/adbailey1/DepAudioNet); again, I need to clean up the code, but it is the most recent version I have been working with for experimentation.
pengsiyu2019102942 commented 4 years ago

Thank you for your answer. I would also like to ask: did you write a paper I could reference for this project? Looking forward to your reply, and thank you again!

adbailey1 commented 4 years ago

Thanks for asking, I will get back to you asap. If you have any other questions in the meantime please let me know.

pengsiyu2019102942 commented 4 years ago

Hello, could you send your experiment results and experiment steps to my mailbox? The email is pengsy098@nenu.edu.cn. Looking forward to your answer.

adbailey1 commented 4 years ago

Hi, I am still working on a paper you can reference, but I should have this done by tomorrow.

With regards to setup and results, I really need to mark this repo as obsolete. It's a good place to start from, as I built in many options, hyperparameters, models, etc.; however, I have since scaled it down and made a reproduction of DepAudioNet, which is what I have based everything on anyway.

If you check out https://github.com/adbailey1/DepAudioNet_reproduction you'll find my reproduction of DepAudioNet along with setup and results. It should also be an easier repo to work with as there is less "stuff" going on.

The last time I used this current repo was around 6-7 months ago, so please feel free to mess around with it if you want, but again, I need to mark it as obsolete and I would recommend you move over to my DepAudioNet_reproduction.