uw-ssec / MAWpy

Mobility Analysis Workflow in Python
https://mawpy.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

perf: update Incremental Clustering to use pandas dataframe for data processing. #26

Open anujsinha3 opened 2 weeks ago

anujsinha3 commented 2 weeks ago

Currently, the input file format for incremental clustering is a CSV file.

However, the code does not load this file into a pandas DataFrame. It reads each line of the CSV file separately and accesses the columns by integer position.

This prevents vectorization and forces additional nested loops, both of which hurt performance. It also makes the code cluttered and difficult to follow.
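To illustrate the difference, here is a minimal sketch rather than the actual MAWpy code. The column names (`user_id`, `unix_time`, `lat`, `lon`), the sample rows, and both function names are assumptions made up for this example; the real input schema may differ. It computes the same per-user mean latitude twice: once row by row with integer indexing, and once with a DataFrame and named columns.

```python
import csv
import io

import pandas as pd

# Hypothetical sample input; the real MAWpy CSV schema may differ.
SAMPLE_CSV = """user_id,unix_time,lat,lon
u1,100,47.60,-122.33
u1,160,47.61,-122.33
u2,200,47.65,-122.30
u2,290,47.65,-122.31
"""


def mean_lat_row_by_row(text):
    """Row-by-row style: each line is parsed separately and
    columns are addressed by integer position."""
    sums, counts = {}, {}
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header line
    for row in reader:
        user = row[0]  # position 0 is assumed to be the user id
        sums[user] = sums.get(user, 0.0) + float(row[2])  # position 2: latitude
        counts[user] = counts.get(user, 0) + 1
    return {user: sums[user] / counts[user] for user in sums}


def mean_lat_pandas(text):
    """DataFrame style: named columns and a vectorized group-by."""
    df = pd.read_csv(io.StringIO(text))
    return df.groupby("user_id")["lat"].mean().to_dict()


if __name__ == "__main__":
    print(mean_lat_row_by_row(SAMPLE_CSV))
    print(mean_lat_pandas(SAMPLE_CSV))
```

Both functions return a dict keyed by user id. The DataFrame version needs no explicit loop and refers to columns by name, so a change in column order would not silently break it.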

The task is:

anujsinha3 commented 1 week ago

First pass complete.