huggingface / setfit

Efficient few-shot learning with Sentence Transformers
https://hf.co/docs/setfit
Apache License 2.0
2.24k stars, 222 forks

Item taxonomy generation #523

Open ya-stack opened 6 months ago

ya-stack commented 6 months ago

Hi @tomaarsen, I'm working with hierarchical data, specifically item taxonomy data. My goal is to predict four levels (product type, product subtype, merchandise type, and item type) from product titles and descriptions. I'm looking for advice on how to prepare the data for this multi-label classification problem while preserving the hierarchical structure. For instance, if the product type is "furniture", the model should choose the product subtype only from subtypes within the furniture category, and likewise for merchandise type and item type. Below is a snippet of the data:

| tcin | product_type_n | product_sub_type_n | merchandise_type_n | item_type_n |
| --- | --- | --- | --- | --- |
| XYZ | HOME | BEDDING | blankets and throws | Throw Blankets |
| BCD | HOME | SOFT HOME | rugs, mats and grips | Rugs |
| PQR | FURNITURE | seating and tables | standalone tables | Console Tables |
| ABC | HOME | SOFT HOME | rugs, mats and grips | Rugs |
| EFG | FURNITURE | bedroom furniture | beds and mattresses | Beds |
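To make the hierarchy constraint concrete, here is a minimal sketch in plain Python of one way the rows above could be prepared. It is not part of SetFit, and the helper names (`build_children`, `allowed_labels`, `path_label`) are hypothetical. It assumes the taxonomy can be derived from the training rows themselves: each path of ancestor labels maps to the set of labels seen at the next level, so a lower-level prediction can be restricted to children of the predicted parent.

```python
from collections import defaultdict

# Column names taken from the data snippet, ordered from coarsest to finest level.
LEVELS = ["product_type_n", "product_sub_type_n", "merchandise_type_n", "item_type_n"]

# The five sample rows from the snippet (tcin followed by the four levels).
RAW = [
    ("XYZ", "HOME", "BEDDING", "blankets and throws", "Throw Blankets"),
    ("BCD", "HOME", "SOFT HOME", "rugs, mats and grips", "Rugs"),
    ("PQR", "FURNITURE", "seating and tables", "standalone tables", "Console Tables"),
    ("ABC", "HOME", "SOFT HOME", "rugs, mats and grips", "Rugs"),
    ("EFG", "FURNITURE", "bedroom furniture", "beds and mattresses", "Beds"),
]
rows = [dict(zip(["tcin"] + LEVELS, r)) for r in RAW]


def build_children(rows, levels=LEVELS):
    """Map each ancestor path (a tuple of labels) to the labels seen at the next level."""
    children = defaultdict(set)
    for row in rows:
        path = ()
        for level in levels:
            children[path].add(row[level])
            path += (row[level],)
    return children


def allowed_labels(children, path):
    """Return the labels permitted at the next level, given the labels chosen so far."""
    return sorted(children.get(tuple(path), ()))


def path_label(row, levels=LEVELS, sep=" > "):
    """Flatten the four levels into one label string that keeps the full path."""
    return sep.join(row[level] for level in levels)


children = build_children(rows)
print(allowed_labels(children, ()))              # candidate product types
print(allowed_labels(children, ("FURNITURE",)))  # subtypes valid under FURNITURE
print(path_label(rows[0]))                       # one flattened path label
```

Two preparation options follow from this. The flattened path label turns the task into a single-label problem whose classes are valid paths only, at the cost of many classes. Alternatively, one classifier per level can be trained and its candidate labels filtered with `allowed_labels` at prediction time.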

Thanks in advance.