ctmm-initiative / ctmmweb

Web app for analyzing animal tracking data, built upon ctmm R package
http://biology.umd.edu/movement.html
GNU General Public License v3.0

Update data after time subsetting #22

Closed xhdong-umd closed 7 years ago

xhdong-umd commented 7 years ago

After the user finishes time subsetting, I plan to update the data set with the time range list:

@chfleming @jmcalabrese Will this work with your ideal workflow for time subsetting?

chfleming commented 7 years ago

Is there a way to highlight (maybe with a change of icon color) that later-stage analyses now need to be updated to reflect the new subsetting?

xhdong-umd commented 7 years ago

For the notification/reminder, I plan to try two options and see which way is better.

Now for the logic of applying time subsetting: I assumed it's possible to use several different time ranges together for the same individual. Then there could be several groups of points that are disjoint in time and location.

Because my code now calculates the median center for each sampling time range, this will not cause problems for distance outliers.
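As an illustration only (Python, not the app's R code), the idea of measuring each point against the median center of its own time range can be sketched as below. The function name `distance_to_segment_median` and the flat `(x, y)` coordinates are assumptions made for this example.

```python
from math import hypot
from statistics import median


def distance_to_segment_median(points, segment_ids):
    """Return each point's distance to the median center of its own segment.

    points      -- list of (x, y) coordinates (hypothetical projected units)
    segment_ids -- parallel list of labels, one per point, naming its time range
    """
    centers = {}
    for seg in set(segment_ids):
        xs = [p[0] for p, s in zip(points, segment_ids) if s == seg]
        ys = [p[1] for p, s in zip(points, segment_ids) if s == seg]
        # Median center: coordinate-wise median of the segment's points.
        centers[seg] = (median(xs), median(ys))
    return [
        hypot(x - centers[s][0], y - centers[s][1])
        for (x, y), s in zip(points, segment_ids)
    ]
```

With two far-apart groups, each point stays close to its own group's center, so neither group is flagged as a distance outlier merely because the other group exists.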

For the speed calculation, the start and end points of each time range may not be a problem either. There could be a long-distance jump between groups, but the time elapsed is also long. I'm not totally sure whether this can cause problems, though.
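A rough sketch of that reasoning, in Python rather than the app's R code and using a simple distance-over-time speed (an assumption; the app's actual speed calculation may differ): the step that bridges two time ranges covers a long distance, but it is divided by an equally long time gap.

```python
from math import hypot


def step_speeds(track):
    """Return the speed of each step between consecutive fixes.

    track -- list of (t_seconds, x_m, y_m) tuples sorted by time
             (hypothetical units chosen for this example)
    """
    speeds = []
    for (t0, x0, y0), (t1, x1, y1) in zip(track, track[1:]):
        dt = t1 - t0
        # Guard against duplicate timestamps instead of dividing by zero.
        speeds.append(hypot(x1 - x0, y1 - y0) / dt if dt > 0 else float("nan"))
    return speeds


# Two fixes one minute apart, a one-day gap with a 5 km jump, then two more fixes.
track = [(0, 0, 0), (60, 30, 0), (86460, 5030, 0), (86520, 5060, 0)]
speeds = step_speeds(track)
```

Here the bridging step is slower than the steps inside each range, so it would not stand out as a speed outlier. A short gap combined with a long jump would behave differently.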

xhdong-umd commented 7 years ago

I had almost finished this feature when I realized there could be another approach.

@chfleming @jmcalabrese How about we make any time subset (which could be multiple time segments for one individual) into a "new individual" named with a suffix, so that a time subset of Cilla becomes Cilla_subset_1, and add it to the dataset?

The previous design needs to maintain a subset of the original data, and the user needs to be reminded that they are working on the subset. We can use various reminders and notifications for this, but a separate subset individual can use all the existing features without any maintenance burden.

You can also have multiple time subsets and check them separately or together. We can also add a button to delete selected individuals on the visualization page, so the user can remove the subsets or any individual if needed.

jmcalabrese commented 7 years ago

I like the "new individual" idea. I think that is cleaner than repeatedly reminding the user that they are working with a subset of the data.

chfleming commented 7 years ago

I agree

xhdong-umd commented 7 years ago

@chfleming I assume as.telemetry always sorts the individuals by identity name, right? I need to sort the new individual into place after inserting it into the original data.

There are so many things to consider when inserting an artificial dataset. This approach may be easier for the user, but it is actually much more complex for the developer.

xhdong-umd commented 7 years ago

I finally finished this feature and updated the repo. It's quite complex, because adding a new data set to an existing structure means maintaining and updating a lot of things.

When you click the generate subset button, a new individual is generated and the app moves back to the visualization page, with a notification.

A subset of Cilla will be named Cilla_subset_1. Another subset of Cilla will be Cilla_subset_2. You can also subset Cilla_subset_1 to get Cilla_subset_1_subset_1.

The subset suffix is a little long, but using a number only could be confused with the original individual's name.
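For illustration, the naming scheme described above could be sketched as follows. This is Python, not the app's R code, and the helper `next_subset_name` and its "highest existing index plus one" rule are assumptions. The thread does not say how the app picks the next number.

```python
import re


def next_subset_name(parent, existing_names):
    """Return the next free '<parent>_subset_<n>' name.

    Only direct subsets of `parent` are counted, so 'Cilla_subset_1_subset_1'
    does not affect the numbering of subsets taken from 'Cilla'.
    """
    pattern = re.compile(re.escape(parent) + r"_subset_(\d+)")
    used = []
    for name in existing_names:
        match = pattern.fullmatch(name)
        if match:
            used.append(int(match.group(1)))
    return f"{parent}_subset_{max(used, default=0) + 1}"


names = ["Cilla", "Gabs"]
first = next_subset_name("Cilla", names)
second = next_subset_name("Cilla", names + [first])
nested = next_subset_name(first, names + [first, second])
```

Because the full suffix is kept, a nested subset name still shows which individual and which earlier subset it came from.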

chfleming commented 7 years ago

I think as.telemetry leaves the order of animals the same as in the CSV file. The times do get sorted, though, but with a warning because that usually only happens when the time format is misspecified.

xhdong-umd commented 7 years ago

I decided to always sort the list so that the identities are in order. This behavior may differ from ctmm, but that should not be a problem.
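A minimal sketch of the "insert, then re-sort by identity" step, in Python rather than the app's R code. The function `add_individual` and the dictionary of tracks are hypothetical stand-ins for the app's data structure.

```python
def add_individual(tracks, name, data):
    """Return a new mapping with `name` added and keys sorted by identity.

    tracks -- dict mapping identity name to that individual's data
    """
    if name in tracks:
        raise ValueError(f"identity already exists: {name}")
    updated = dict(tracks)
    updated[name] = data
    # Re-sort so the order never depends on when a subset was created.
    return dict(sorted(updated.items()))


tracks = {"Gabs": ["gabs rows"], "Cilla": ["cilla rows"]}
tracks = add_individual(tracks, "Cilla_subset_1", ["subset rows"])
order = list(tracks)
```

Sorting on every insert keeps each subset next to its parent, since the subset name starts with the parent's name.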

The feature is complete now. I spent quite some time changing how the outlier pages work, and also fixed some hidden bugs that appeared when outliers and time subsetting both exist.