Closed giuliowaitforitdavide closed 1 year ago
Yes, you are right. We should come up with a name for the new file/class. I think "popularity" could be misleading, since `InteractionSegmentation` and `ActivitySegmentation` are also measuring "popularity", somehow. What about renaming the class `InteractionPercentage` and creating a file named `percentages.py`?
Maybe we could:
- separate the `test_pattern` function in utils from the matrix function, and call the file `matrix.py`
- rename the `test_pattern` function to `check_pattern`, to avoid confusion with the test functions, and put it in an `errors_utils.py` or inside `errors.py` itself
- create a `segmentations_utils.py` and put the `PopularityPercentage` function there. It should not be considered a subclass of the Segmentation one; I think it could simply be a function.

Another idea could be to completely reorganize the directories in this way:
recsyslearn
├── errors
│   ├── errors.py
│   └── utils.py
├── metrics
│   ├── Entropy.py
│   ├── ...other metrics
│   └── utils.py
└── segmentations
    ├── ActivitySegmentation.py
    ├── ...other segmentations
    └── utils.py
Do you like these options?
I like the second option, but I think the last folder should not be called `segmentations`, but something like `activity`. `PopularityPercentage` and `InteractionSegmentation` are both measures of the popularity/"activity" of an item, but the first one is not a segmentation.
The main idea is to put all the possible segmentations for a given dataset into `segmentations`, and to move functions such as `PopularityPercentage` into `utils.py`, because it can be considered a helper for the segmentations. I don't know if that's clear or whether you like it.
I think `PopularityPercentage` should be considered a measure of popularity, just as much as `InteractionSegmentation` is. They are at the same level in terms of usage of the library (they both assign a popularity score to each item, one as a percentage and one as a kind of "percentile"). Therefore I would not put `PopularityPercentage` in `utils.py`.
As we already discussed, the idea is to use this as a score to then segment the items/users based on popularity. Don't you think it is more correct to put this class in a separate file? I don't think it's semantically correct to consider it a segmentation technique.
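
To make the distinction in this thread concrete, here is a minimal sketch of a score that a separate segmentation step then consumes. It is not recsyslearn's actual code: the function names `popularity_percentage` and `segment_by_score`, the `(user, item)` tuple input, and the single-cutoff rule are all assumptions made for illustration.

```python
from collections import Counter


def popularity_percentage(interactions):
    """Hypothetical helper: each item's share of all interactions.

    `interactions` is assumed to be an iterable of (user, item) pairs.
    """
    counts = Counter(item for _user, item in interactions)
    total = sum(counts.values())
    return {item: count / total for item, count in counts.items()}


def segment_by_score(scores, cutoff):
    """Illustrative segmentation: label items 'head' or 'tail' by a cutoff.

    This only shows that a segmentation consumes a score; it is not the
    rule used by InteractionSegmentation.
    """
    return {
        item: ("head" if score >= cutoff else "tail")
        for item, score in scores.items()
    }


# Toy data: item "a" has three interactions, item "b" has one.
interactions = [("u1", "a"), ("u2", "a"), ("u3", "a"), ("u1", "b")]
scores = popularity_percentage(interactions)
groups = segment_by_score(scores, cutoff=0.5)
```

In this sketch the percentage is a plain function that returns a score per item, and the segmentation is a second step that turns scores into groups, which is one way to read the argument that the percentage itself is not a segmentation technique.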