New data fit the old model brought more syllables.

dattalab / keypoint-moseq

https://keypoint-moseq.readthedocs.io

New data fit the old model brought more syllables. #176

Open seesaw1992 opened 1 month ago

seesaw1992 commented 1 month ago

Hi, I applied new data to the existing model, but the results show more syllables than the previous fitting. My code is:

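import keypoint_moseq as kpms

# project_dir and model_name are assumed to be defined earlier in the session
# config is assumed to be the usual project-config loader
config = lambda: kpms.load_config(project_dir)

# load the trained model (first element of the checkpoint) and the saved PCA
model = kpms.load_checkpoint(project_dir, model_name)[0]
pca = kpms.load_pca(project_dir)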

# load new data (e.g. from deeplabcut)
new_data = new_data  # can be a file, a directory, or a list of files
coordinates, confidences, bodyparts = kpms.load_keypoints(new_data, 'deeplabcut')
data, metadata = kpms.format_data(coordinates, confidences, **config())

# apply saved model to new data
#results = kpms.apply_model(model, pca, data, metadata, project_dir, model_name)
results = kpms.apply_model(model, data, metadata, project_dir, model_name, **config())

# Save results
kpms.save_results_as_csv(results, project_dir, model_name)

calebweinreb commented 1 month ago

What do you mean by "show more syllables"? It is possible that some previously rare syllables are now more common and are therefore being included. By the way, we have occasionally noticed some differences between the way syllables are assigned during training vs. applying the model, so I would recommend applying the model to all your data, including the original data.
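
To illustrate how a rare syllable can cross an inclusion cutoff once more data is added, here is a minimal sketch on toy labels. The function name `frequent_syllables`, the threshold value, and the label lists are all made up for illustration. Frequency is taken here as the share of frames, which may differ from how keypoint-moseq computes it.

```python
from collections import Counter


def frequent_syllables(labels, min_frequency):
    """Return the set of syllable ids whose share of frames is >= min_frequency."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {syllable for syllable, n in counts.items() if n / total >= min_frequency}


# Toy per-frame labels (hypothetical): syllable 2 is rare in the old data
# and more common in the new data.
old_labels = [0] * 60 + [1] * 39 + [2] * 1
new_labels = [0] * 50 + [1] * 40 + [2] * 10
threshold = 0.05  # hypothetical cutoff, not a library default

print(sorted(frequent_syllables(old_labels, threshold)))  # [0, 1]
print(sorted(frequent_syllables(new_labels, threshold)))  # [0, 1, 2]
```

In this toy case the model's syllables are unchanged; only the number that clears the cutoff grows.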

seesaw1992 commented 1 month ago

Thank you for suggesting that I include all the experimental data. I tried this, but unfortunately the dataset is too big (100 videos, 30 minutes long, 60 Hz, 5 deeplabcut tracking points). The kernel crashed on Colab, and it also crashed on HPC (128 G CPU2, GPU2).

I'm wondering whether the syllables keep the same meaning when new data is applied to the trained model?

calebweinreb commented 1 month ago

For applying the model, it doesn't matter whether you run it on all the data at once or one session at a time. So you could write a loop and crunch the data in batches.
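
A batching loop along these lines might look like the sketch below. It is not the maintainers' implementation. The helpers `chunked`, `apply_in_batches`, and `make_kpms_batch_fn` are hypothetical names, and the keypoint-moseq calls simply mirror the ones in the original post. It assumes `apply_model` returns a dict keyed by recording name, and it does not address how results saved to disk behave across repeated calls, so that is worth checking before relying on it.

```python
from typing import Callable, Dict, Iterator, List, Sequence


def chunked(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `batch_size` entries."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def apply_in_batches(
    files: Sequence[str],
    batch_size: int,
    apply_batch: Callable[[List[str]], Dict],
) -> Dict:
    """Call `apply_batch` on each chunk of files and merge the returned dicts."""
    merged: Dict = {}
    for batch in chunked(files, batch_size):
        merged.update(apply_batch(batch))
    return merged


def make_kpms_batch_fn(model, project_dir, model_name):
    """Hypothetical wiring that reuses the calls from the original post."""
    import keypoint_moseq as kpms  # imported lazily so the helpers above stay generic

    config = lambda: kpms.load_config(project_dir)

    def apply_batch(batch: List[str]) -> Dict:
        # Load and format only this batch, so memory use scales with batch size
        coordinates, confidences, bodyparts = kpms.load_keypoints(batch, 'deeplabcut')
        data, metadata = kpms.format_data(coordinates, confidences, **config())
        # Assumption: the result is a dict keyed by recording name
        return kpms.apply_model(model, data, metadata, project_dir, model_name, **config())

    return apply_batch
```

Usage would be something like `results = apply_in_batches(file_list, 10, make_kpms_batch_fn(model, project_dir, model_name))`, where `file_list` and the batch size of 10 are placeholders to adapt to the available memory.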