2023-ACM-A Variational Neural Architecture for Skill-based Team Formation

thangk commented 1 month ago

Link: https://dl.acm.org/doi/10.1145/3589762

Main problem

Forming successful teams of experts for projects is complicated. Existing models are mostly graph-based and are neither the most efficient nor the most accurate at forming teams, so there is room for improvement on the team formation task.

Proposed method

The paper proposes a variational Bayesian neural network that uses the experts' collaborations in past teams to predict future teams. The method aims to form teams more accurately while still covering the skills a team requires.

My Summary

In this paper, the proposed method outperformed the baselines on all ranking and quality metrics, and its training is also reported to be faster and more efficient. There is one drawback: similar skill keywords are treated as different skills, even when they differ as little as "server admin" and "server administrator". The paper leaves this open but suggests it can be resolved by adding a word analysis step to the pipeline, so the skills are cleaned before going into the neural network.
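To make the suggested word analysis step concrete, here is a minimal sketch of what cleaning skill keywords before indexing could look like. It is not from the paper: the `ABBREVIATIONS` table and the `normalize_skill` and `build_skill_index` helpers are hypothetical, and a real pipeline would probably need a curated synonym list or stemming instead.

```python
import re

# Hypothetical abbreviation table (an assumption, not from the paper).
ABBREVIATIONS = {"admin": "administrator", "dev": "developer", "db": "database"}


def normalize_skill(skill: str) -> str:
    """Lowercase a skill keyword, tokenize it, and expand known abbreviations."""
    # Keep '+' and '#' so that keywords such as "c++" and "c#" stay distinct.
    tokens = re.findall(r"[a-z0-9+#]+", skill.lower())
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


def build_skill_index(raw_skills):
    """Map each canonical skill to one integer id, merging near-duplicates."""
    index = {}
    for raw in raw_skills:
        canonical = normalize_skill(raw)
        index.setdefault(canonical, len(index))
    return index


skills = ["server admin", "Server Administrator", "DB design"]
index = build_skill_index(skills)
print(index)
```

With this step, "server admin" and "Server Administrator" share one skill id instead of taking two separate positions in the skill vector.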

Datasets

DBLP (33,002 teams, 2,000 skills, and 2,470 experts)
Dota2 (6,390 teams, 3,005 skills, and 2,727 experts)

hosseinfani commented 1 month ago

Hi @thangk thanks for the summary. Are you able to track down the Dota2 dataset and add it to our pipeline as a new dataset?

thangk commented 1 month ago

> Hi @thangk thanks for the summary. Are you able to track down the Dota2 dataset and add it to our pipeline as a new dataset?

I'm not sure if I'd be able to find the author's cleaned Dota2 dataset.

I checked his GitHub repo and it only has the dblp and imdb datasets, not Dota2. https://github.com/radinhamidi/A-Variational-Neural-Architecture-for-Skill-based-Team-Formation/tree/main/dataset

Also, the raw version comes from the Kaggle dataset below (the link is from the paper), and judging by what's posted there, he did a lot of work to clean the data. https://www.kaggle.com/datasets/devinanzelmo/dota-2-matches/data

hosseinfani commented 1 month ago

@thangk So, the dataset is available but not in our format, right? Please explore how we can map it to our definition of teams with experts and skills, e.g., players would be the experts, ... It would be great to have a dataset from online gaming.
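As a starting point for that mapping, here is a rough sketch under several assumptions that still need to be checked against the Kaggle files: each row describes one player in one match with `match_id`, `account_id`, `hero_id`, and `player_slot` fields; slots below 128 belong to Radiant and the rest to Dire; and `account_id` 0 marks an anonymous player. Treating the picked hero as the skill is only a placeholder choice, and `build_teams` is a hypothetical helper, not part of our pipeline.

```python
from collections import defaultdict

ANONYMOUS_ID = 0  # assumption: anonymous players share account_id 0


def build_teams(player_rows):
    """Group per-player rows into teams, one team per (match, side)."""
    grouped = defaultdict(list)
    for row in player_rows:
        if int(row["account_id"]) == ANONYMOUS_ID:
            continue  # an anonymous player cannot be tracked as an expert
        # Assumption: player_slot < 128 is Radiant, otherwise Dire.
        side = "radiant" if int(row["player_slot"]) < 128 else "dire"
        grouped[(int(row["match_id"]), side)].append(row)

    teams = []
    for match_id, side in sorted(grouped):
        rows = grouped[(match_id, side)]
        teams.append({
            "id": f"{match_id}-{side}",
            # Players are the experts.
            "experts": sorted({int(r["account_id"]) for r in rows}),
            # Placeholder: the heroes picked by the team act as its skills.
            "skills": sorted({int(r["hero_id"]) for r in rows}),
        })
    return teams


# Made-up rows for illustration only; not real data.
rows = [
    {"match_id": 0, "account_id": 11, "hero_id": 86, "player_slot": 0},
    {"match_id": 0, "account_id": 0, "hero_id": 51, "player_slot": 1},
    {"match_id": 0, "account_id": 12, "hero_id": 7, "player_slot": 128},
    {"match_id": 0, "account_id": 13, "hero_id": 29, "player_slot": 129},
]
teams = build_teams(rows)
for team in teams:
    print(team)
```

The function casts every field with `int()`, so rows read through `csv.DictReader` (which yields strings) could be passed in directly.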

thangk commented 1 month ago

> So, the dataset is available but not in our format, right? Please explore how we can map it to our definition of teams with experts and skills, e.g., players would be the experts, ... It would be great to have a dataset from online gaming.

@hosseinfani Sure, I can look into it. May I know who made our filtered datasets and how? Maybe I can follow a similar process.

hosseinfani commented 1 month ago

Myself. I can explain the code to you at the lab.

fani-lab / OpeNTF