Yasmen-Wahba opened this issue 1 year ago
Hi @Yasmen-Wahba,
This is an interesting topic and I believe it is possible. However, we are currently working on multi-label classification and do not have the bandwidth for this feature right now. I will leave this issue open and will likely work on it in the future.
There was someone who volunteered to implement this on GitLab, but I am not sure if he managed to do it.
Out of curiosity, how much time do you estimate it would save you if you could skip retraining the model?
Actually, it's not about time. The current model I deployed is linear, so it's very fast and its accuracy is very satisfactory. The problem lies in the nature/distribution of the incoming data, where new classes are being added and my model knows nothing about them. That's why I thought about Incremental/Continual/Lifelong Learning to process data as streams and to adapt to new changes without the need to retrain the model each time a new class is added!
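For context on why plain incremental fitting does not already cover this: scikit-learn's own streaming interface, `partial_fit`, requires every class to be declared on the very first call, so a genuinely new class still forces a full retrain. A minimal sketch of the problem (using `GaussianNB` purely for illustration; this is not HiClass's API):

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# First batch: every class must be declared up front on the first call.
X_old = np.array([[0.0], [1.0]])
y_old = np.array(["A", "B"])
clf = GaussianNB()
clf.partial_fit(X_old, y_old, classes=np.array(["A", "B"]))

# A later batch containing the unseen class "C" is rejected,
# because "C" was not in the initial `classes` array.
rejected = False
try:
    clf.partial_fit(np.array([[2.0]]), np.array(["C"]))
except ValueError:
    rejected = True
print("unseen class rejected:", rejected)
```

This is exactly the gap described above: the stream keeps flowing, but a class the model has never seen cannot simply be appended to an already-fitted estimator.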
Let me see if I understood correctly... For example, if you train the model with classes A and B, then later you only want to train with class C, without adding new samples for classes A and B, right? If that is the case, then I believe it would be easier to implement and I could try to do it next week.
I want to be able to train the models with class C or with class C and D. I want my model to accept new class(es) and continue training without complaining.
Hi @Yasmen-Wahba,
I only have a couple more questions before I start implementing this.
Do you think warm_start would be a good option for your use case? For example:
```python
lcpn = LocalClassifierPerNode(
    local_classifier=SVM(...),
    warm_start=True,
)
lcpn.fit(x, y)
# a few days later...
lcpn.fit(new_x, new_y)
```
Will you be able to ensure that only new data is being used in subsequent calls to fit, e.g., only classes C and D in your example? Or do you think it is better to check and skip classes for nodes that were already fitted previously?
Hi Fabio. It would be great if it could accept both old and new classes, check which classes were already fitted, skip those, and fit only the new ones :)
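The "check for old, skip, and fit only new" behavior could be sketched roughly like this. All names here (`fit_node`, the `node_classifiers` dict) are assumptions for illustration, not HiClass's actual internals:

```python
from sklearn.base import clone
from sklearn.exceptions import NotFittedError
from sklearn.utils.validation import check_is_fitted

def fit_node(node_classifiers, node, base_estimator, X, y, warm_start=True):
    """Fit the local classifier for `node`, skipping it when warm_start
    is enabled and the node was already fitted in a previous call."""
    clf = node_classifiers.get(node)
    if warm_start and clf is not None:
        try:
            check_is_fitted(clf)
            return clf  # already trained in an earlier call: skip
        except NotFittedError:
            pass
    # New (or never-fitted) node: train a fresh copy of the base estimator.
    clf = clone(base_estimator)
    clf.fit(X, y)
    node_classifiers[node] = clf
    return clf
```

With something like this, a second `fit` call containing classes A, B, C, and D would leave the A and B node classifiers untouched and only train new classifiers for C and D.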
I was wondering if it would be possible for HiClass to incorporate Online/Continual Learning to process streams of data. Something like what is implemented in a library like "avalanche" https://github.com/ContinualAI/avalanche