During my internship, I had the chance to work on AutoML multimodal classification, where the modalities include tabular, text, and image data. Many text--image datasets already exist (images with captions, visual question answering), as do text--tabular datasets (Multimodal AutoML on Structured Tables with Text Fields). However, there is no comprehensive, publicly available dataset that combines tabular, text, and image data, which makes this setting both unusual and challenging.

After some discussion with Wenjing, we found that games like Hearthstone, Magic, DOTA, and LoL are ideal resources for a multimodal classification testbed.

For instance, below are several cards from Hearthstone. They contain tabular features (cost, attack, HP, card type, minion type, etc.), image features, and textual features (descriptions of the card's effect or its background story). It is relatively easy to convert the data into a classification task (e.g. we can predict the cost of a card from all its other features).

I was wondering whether you would be interested in a resource paper, which I think could be low-hanging fruit. If multimodal classification does not fit our lab's direction, we might be able to convert it into a graph problem or even a multimodal KG/KB (I would recommend doing this after releasing the resource paper and using some existing multimodal AutoML models to set up baselines).

The main risk I can name now is copyright. However, I did find some Creative Commons licensed resources we can use. I would like to invite my wife as a co-author of the resource paper because she contributed a lot to shaping this idea. Moreover, the initial stage looks relatively easy, so if possible I would like to hire some undergraduates to help.

Aug 18

Project Deliverables

Benchmark Dataset, with polished tasks
Existing AutoML Tool Exp results
A paper submitted to ICML'23, Automl CC'23, or KDD'23
- call for resource paper: CIKM (May), SIGIR (Feb, SIGIR'22 papers, CFP), NeurIPS(22 example)
- paper examples: https://dl-acm-org.proxy.library.emory.edu/doi/pdf/10.1145/3477495.3532019, https://arxiv.org/pdf/2202.11684.pdf

Project Status

Form a team: 1-2 undergraduate students openings

Aug - Sep Plan

Collect data from 1~2 games
Figure out the copyright issue
Get used to automl tools such as AutoGluon and FLAML

Aug 20- 27 Plan

I'd suggest the following things for next week

Data Collection (high-priority): we need to pay extra attention to copyright issues
- pokemon: https://bulbapedia.bulbagarden.net/wiki/Main_Page
- ~animal crossing~: https://docs.google.com/spreadsheets/d/13d_LAJPlxMa_DubPTuirkIV4DERBMXbrWQsmSh8ReK4/edit#gid=1022368750
- ~DOTA~: ~123 heroes; https://github.com/kriskate/dota-data (old); https://dota2.fandom.com/wiki/Table_of_hero_attributes
- ~LOL~: 161 heroes; https://www.op.gg/champions
- HearthStone https://hearthstonejson.com/ (first reference from: https://github.com/deepmind/card2code)
- Magic: https://scryfall.com/docs/api/bulk-data
- Farming Simulator Equipment: https://farmingsimulator.fandom.com/wiki/Equipment/Farming_Simulator_22, price prediction (regression), reportedly over 400 pieces of equipment
- Diablo II Items: https://diablo-archive.fandom.com/wiki/Items_(Diablo_II), Quality Level or Level Requirement prediction
Paper Reading
- AutoGluon Tabular + Text paper: https://openreview.net/pdf?id=OHAIVOOl7Vl. It is OK to read only the problem setting; there is no need to understand the technical details
Try to install AutoGluon and go through several tutorials (low-priority)
- https://auto.gluon.ai/stable/tutorials/tabular_prediction/tabular-quickstart.html: this is a light example that we can run even on laptop
- https://auto.gluon.ai/stable/tutorials/tabular_prediction/tabular-multimodal-text-others.html: this requires GPU machine. Let me know if you want to use Emory machine or Google Colab
Test Bed:
- CPU only
- CPU+GPU

Meeting Notes

Train 80%, Val 5%, Test 15%
label distribution: we don't want zero-shot.
we need an image_col to store the image path; for simplicity we can use image
TODO next week: Animal crossing pick 2~3 tasks, store all in raw.csv, images/
- villager gender prediction: class label, train.csv, dev.csv, test.csv, images/train_image, dev_image...
- villager species prediction: ...
DOTA also has pre-processed data
- predict hero type/role?
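
A minimal sketch of the split discussed in these notes, assuming each dataset is loaded as a list of dict rows. `stratified_split` and its parameters are hypothetical names, not the project's actual splitting script. It splits each label separately with integer percentages (80/5/15) and gives the remainder to train, so every label that shows up in dev or test also appears in train (no zero-shot labels).

```python
import random
from collections import defaultdict


def stratified_split(rows, label_col, percents=(80, 5, 15), seed=0):
    """Split rows into (train, dev, test), label by label.

    rows      -- list of dicts, one per sample (assumed in-memory format)
    label_col -- name of the label column, e.g. "Gender"
    percents  -- integer train/dev/test percentages; must sum to 100
    """
    if sum(percents) != 100:
        raise ValueError("percents must sum to 100")
    _, dev_pct, test_pct = percents

    # Group samples by label so each label is split on its own.
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_col]].append(row)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, dev, test = [], [], []
    for group in by_label.values():
        rng.shuffle(group)
        n = len(group)
        # Integer arithmetic rounds down, so dev + test never take the
        # whole group and train keeps at least one sample per label.
        n_test = n * test_pct // 100
        n_dev = n * dev_pct // 100
        test.extend(group[:n_test])
        dev.extend(group[n_test:n_test + n_dev])
        train.extend(group[n_test + n_dev:])
    return train, dev, test
```

Because each label is rounded down separately, dev and test can end up smaller than 5% and 15% of the whole dataset when there are many small classes.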

Aug 28 - Sep 2

Observations

AnimalCrossing
DOTA

To discuss

[x] AnimalCrossing_Gender
- [x] total 416 samples, 416*0.15~=62; Got 42 for test set
- [x] duplicate cols: "Image Path", "Filename" -> let's just use "Image" (case sensitive)
  - [x] in col Image, we want to have relative path like dev_images/bul05.png
- [x] now uncompress "dev_images.zip", I got Villagers_Gender_dev. We want a uniform directory name "dev_images" so scripts can be easily scaled.
[x] Baselines (assume we use python3.9)
- [x] AutoGluon best_preset, CPU and GPU version. @lujiaying finish skeleton
[x] Move to Pokemon for a large dataset.
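
To make the image path convention above concrete, here is a small sketch, assuming image files live in one folder per split named `<split>_images`. `normalize_image_path` is a hypothetical helper name; it keeps only the file name, prefixes the uniform split directory, and also collapses an accidentally repeated image extension.

```python
from pathlib import PurePosixPath

# Extensions treated as images when collapsing repeats (an assumption).
IMAGE_EXTS = {".png", ".jpg", ".jpeg"}


def normalize_image_path(raw_path, split):
    """Return '<split>_images/<filename>' for any raw image path."""
    # Accept Windows-style separators too, then keep only the file name.
    name = PurePosixPath(raw_path.replace("\\", "/")).name
    p = PurePosixPath(name)
    suffixes = [s.lower() for s in p.suffixes]
    # Drop a duplicated trailing extension such as 'x.png.png' -> 'x.png'.
    while len(suffixes) >= 2 and suffixes[-1] == suffixes[-2] and suffixes[-1] in IMAGE_EXTS:
        p = p.with_suffix("")
        suffixes = [s.lower() for s in p.suffixes]
    return f"{split}_images/{p.name}"
```

Applied to every value in the "Image" column of dev.csv with `split="dev"`, this would yield relative paths like dev_images/bul05.png regardless of the directory name the zip file originally used.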

Next week plan

[x] @qyccc3 Explore Pokemon; try to avoid crawler. Can we directly download the whole website?
- plan: webpage with image here; available csv can be found here
[x] @qyccc3 polish animal crossing dataset with 80%/5%/15%, and test the performance
[x] @lujiaying explore where we can download Magic: 72200 cards!!

Sep 3 - Sep 9

To discuss

[x] info.txt is a great idea. We do need it to record which column is the label. I'd also recommend adding a line such as "label column: Gender" for users.
[x] AnimalCrossing_Gender (maybe we can consider removing this task)
- [x] test.csv "Image Path" gender_images/cat02.png --> test_images/cat02.png
- [x] feature importance

                  importance    stddev       p_value  n  p99_high   p99_low
Personality         0.409677  0.008834  2.593102e-08  5  0.427867  0.391488
Hobby               0.035484  0.021030  9.777106e-03  5  0.078784 -0.007817
Species             0.000000  0.000000  5.000000e-01  5  0.000000  0.000000
Subtype             0.000000  0.000000  5.000000e-01  5  0.000000  0.000000
Birthday            0.000000  0.000000  5.000000e-01  5  0.000000  0.000000
Catchphrase         0.000000  0.000000  5.000000e-01  5  0.000000  0.000000
Favorite Song       0.000000  0.000000  5.000000e-01  5  0.000000  0.000000
Favorite Saying     0.000000  0.000000  5.000000e-01  5  0.000000  0.000000
Style 1             0.000000  0.000000  5.000000e-01  5  0.000000  0.000000
Style 2             0.000000  0.000000  5.000000e-01  5  0.000000  0.000000
Color 1             0.000000  0.000000  5.000000e-01  5  0.000000  0.000000
Color 2             0.000000  0.000000  5.000000e-01  5  0.000000  0.000000
Default Umbrella    0.000000  0.000000  5.000000e-01  5  0.000000  0.000000
Wallpaper           0.000000  0.000000  5.000000e-01  5  0.000000  0.000000
Flooring            0.000000  0.000000  5.000000e-01  5  0.000000  0.000000

[x] AnimalCrossing_Gender
- [x] Image Path: dev_images/crd01.png.png
[x] Pokemon type prediction
- [x] we want to collect 905 pokemons from URL. Right now no need to consider "Generation_IX"
- [x] it would be great if we could get "learn set" attribute, e.g. example URL
- [x] it is fine to either use existing CSVs or crawl by ourselves.
[x] HearthStone or Magic can be next dataset, after pokemon.
- [x] Please be aware of the crawler policy (max QPS bound)
[x] autogluon exec.py
- [x] we can just save the chosen/best model of AutoGluon for now. It would be great to save the experiment results under exp_save_dir/exp_results.csv.
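
One possible way to collect results in a single CSV is sketched below. This is not the actual exec.py; `save_exp_result`, `exp_args`, and `exp_results` are hypothetical names, and the sketch assumes every run reports the same set of keys so that one header row fits all runs.

```python
import csv
import os


def save_exp_result(exp_save_dir, exp_args, exp_results, filename="exp_results.csv"):
    """Append one experiment (arguments + results) as a CSV row.

    exp_args    -- dict of run settings, e.g. dataset name and preset
    exp_results -- dict of metrics produced by the run
    Returns the path of the CSV file.
    """
    os.makedirs(exp_save_dir, exist_ok=True)
    path = os.path.join(exp_save_dir, filename)
    row = {**exp_args, **exp_results}

    # Write the header only when the file is created for the first time.
    write_header = not os.path.exists(path)
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(row))
        if write_header:
            writer.writeheader()
        writer.writerow(row)
    return path
```

Appending rather than overwriting lets CPU and GPU runs of several baselines accumulate in one file that is easy to compare later.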

Sep 10 - 16

To-Dos

[x] determine pokemon problem/task
- [x] it is indeed possible to do multi-label: https://auto.gluon.ai/dev/tutorials/tabular_prediction/tabular-multilabel.html# @qyccc3 let me know if you wanna try to write it. Or I can do it for you.
[x] prepare all cards in Hearthstone: class (does the card belong to Druid or Mage?), card rarity, minion race.
[x] try out FLAML or autosklearn (paper link: https://arxiv.org/pdf/2207.12560.pdf)

Sep 24 - 30

[x] exec.py: @lujiaying
[x] split_dataset.py: @lujiaying
[x] AnimalCrossing_Species: test with image @lujiaying
- [x] @qyccc3 train.csv is fine; dev.csv and test.csv contain one extra column, which is the row index. We probably do not need it.
- Decision: AnimalCrossing Species is not a good benchmark task
[x] Pokemon ready (no need to split): @lujiaying has csv to determine columns, then hand to Yongchen for image and final clean
- [x] @qyccc3 remove these columns:egg_type_number, egg_type1, egg_type2, type_number, against_normal, againstfire, ..., against*
- [x] @qyccc3 please upload a version, and it would be great if this version can pass dutils/validate_dataset.py

[x] Hearthstone ready: @qyccc3 we want Hearthstone (including minion, spell, weapon and location); and also minion subset, spell subset.

[x] enhancement: No need to include enhancement
[ ] No_Mechanics shows in several rows? Are they filled by us?
[x] Please check the validity of Hearthstone. All-rarity could not pass the validity check. Double-checked: name contains duplicates. Do they have a unique id in the raw data? We may want to keep them

--> ID column is Image Path


[INFO] Input arguments: Namespace(dataset_dir='datasets/Hearthstone-All/rarity', id_col='name', label_cols=['rarity'])
Following rows from dev EXIST in train!!
*** Please remove the overlaps ***
cardClass  health                     name                 set    type  attack  ...  mechanics_2        race durability element description                          Image Path
0     NEUTRAL     4.0             Sub Scrubber       BATTLEGROUNDS  MINION     4.0  ...          NaN  MECHANICAL        NaN     NaN         NaN  dev_images/BG22_HERO_200_Buddy.jpg
2      HUNTER     NaN  Dragonslayer's Greatbow  YEAR_OF_THE_DRAGON  WEAPON     6.0  ...   ['IMMUNE']         NaN        2.0     NaN         NaN       dev_images/DRGA_BOSS_22t4.jpg
5      SHAMAN     2.0             Glugg's Tail     THE_SUNKEN_CITY  MINION     2.0  ...          NaN       BEAST        NaN     NaN         NaN            dev_images/TSC_639t3.jpg
7     PALADIN     NaN        Hand of Salvation       BATTLEGROUNDS   SPELL     NaN  ...          NaN         NaN        NaN    HOLY         NaN  dev_images/TB_Bacon_Secrets_11.jpg
11    WARLOCK     3.0               Felstalker              LEGACY  MINION     4.0  ...          NaN       DEMON        NaN     NaN         NaN              dev_images/EX1_306.jpg
..        ...     ...                      ...                 ...     ...     ...  ...          ...         ...        ...     ...         ...                                 ...
515   NEUTRAL     8.0              Alexstrasza             VANILLA  MINION     8.0  ...          NaN      DRAGON        NaN     NaN         NaN          dev_images/VAN_EX1_561.jpg
520   NEUTRAL    69.0                 Ragnaros             LETTUCE  MINION     9.0  ...          NaN   ELEMENTAL        NaN     NaN         NaN         dev_images/LETL_028H_01.jpg
521   NEUTRAL     5.0         Elise Starseeker                 LOE  MINION     3.0  ...          NaN   None_Race        NaN     NaN         NaN              dev_images/LOE_079.jpg
528   NEUTRAL     5.0            Queen Azshara     THE_SUNKEN_CITY  MINION     5.0  ...          NaN        NAGA        NaN     NaN         NaN              dev_images/TSC_641.jpg
533   WARRIOR     5.0           Darius Crowley                CORE  MINION     4.0  ...          NaN   None_Race        NaN     NaN         NaN         dev_images/CORE_GIL_547.jpg

[210 rows x 16 columns]
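
The check that produced the message above can be approximated as follows. This is only a sketch under assumptions, not the project's dutils/validate_dataset.py: `find_overlaps` and `duplicated_ids` are hypothetical helpers operating on lists of dict rows. It illustrates why `name` is a poor id column when several cards share a name, and why a column that is unique per card avoids the false overlap.

```python
from collections import Counter


def find_overlaps(train_rows, other_rows, id_col):
    """Return rows of other_rows whose id also occurs in train_rows."""
    train_ids = {row[id_col] for row in train_rows}
    return [row for row in other_rows if row[id_col] in train_ids]


def duplicated_ids(rows, id_col):
    """Return the sorted id values that occur more than once in rows."""
    counts = Counter(row[id_col] for row in rows)
    return sorted(value for value, count in counts.items() if count > 1)


# Toy rows for illustration only; the train rows are made up.
train = [
    {"name": "Alexstrasza", "Image Path": "train_images/a.jpg"},
    {"name": "Felstalker", "Image Path": "train_images/b.jpg"},
]
dev = [
    {"name": "Alexstrasza", "Image Path": "dev_images/VAN_EX1_561.jpg"},
    {"name": "Ragnaros", "Image Path": "dev_images/LETL_028H_01.jpg"},
]

by_name = find_overlaps(train, dev, "name")        # flags the shared name
by_path = find_overlaps(train, dev, "Image Path")  # no overlap on paths
```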



- [x] Currently, we may not need to upload trained artifacts into the cloud folder. If we do upload, it would be great to use `exec.py` to save exp_arguments and exp_results.
- [x] Discuss whether we set it as a multi-column prediction or just multiple tasks (no dependency among different tasks)
- [x] Create a Turing server account for Yongchen, because we want to have a fair comparison between different baselines (same CPU cores, same GPU core). @lujiaying need to discuss this with Dr. Yang.

## own algorithm idea
auto-gnn: automatically construct a graph from the categorical features. The result would be a heterogeneous graph with different types of nodes and different types of edges.
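
One way to read this idea, sketched under assumptions: every sample becomes a node, every distinct value of a categorical column becomes a node whose type is the column name, and each sample is linked to its values by an edge typed after the column. `build_hetero_graph` and the `has_<column>` edge naming are hypothetical choices for illustration, not a settled design.

```python
from collections import defaultdict


def build_hetero_graph(rows, categorical_cols, id_col):
    """Build a simple heterogeneous graph from categorical columns.

    Returns (nodes, edges):
      nodes -- dict: node type -> set of node ids
      edges -- dict: (src type, edge type, dst type) -> list of (src, dst)
    """
    nodes = defaultdict(set)
    edges = defaultdict(list)
    for row in rows:
        sample_id = row[id_col]
        nodes["sample"].add(sample_id)
        for col in categorical_cols:
            value = row.get(col)
            if value is None or value == "":
                continue  # a missing value creates no node and no edge
            nodes[col].add(value)
            edges[("sample", f"has_{col}", col)].append((sample_id, value))
    return dict(nodes), dict(edges)
```

Two samples that share a value, for example the same cardClass, end up two hops apart through the shared value node, which is the kind of structure a GNN could then exploit.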

Oct 1 - Oct 6

TODOs:

Wrap up Pokemon and Hearthstone Datasets (every task is a unique dataset)
- [ ] @qyccc3 all passed validation using https://github.com/lujiaying/MUG-Bench/blob/master/run_dutils.sh
  - [x] Hearthstone_all_cardClass
  - [x] Hearthstone_all_cost
  - [x] Hearthstone_all_rarity
  - [x] Hearthstone_all_set
  - [x] Hearthstone_minion_attack
  - [ ] Hearthstone_minion_race
  - [x] Hearthstone_minion_health
  - [x] Hearthstone_spell_spellSchool
- [x] info.txt: @qyccc3 please include
  - [x] id_col(this one is new),
  - [x] eval_metric (new; for binary we use auc, for multiclass we use log_loss; refer to #5 for details),
  - label_col, and the label distribution (I believe these two are already finished)
- [x] Pokemon type_1, type_2: we remove type_2 when predicting type_1, vice versa. @lujiaying I can do that.
  - [ ] Not sure if we can use type_2 after looking into the label distribution
- [ ] HearthStone @qyccc3 :
  - [x] please add an id column (just use the id from the original source) and an artist column
  - [x] For tasks with too many arbitrary labels (e.g. Health), how about we bin them into 0, 1, 2, 3, ..., 10, 11~20, >20? A similar re-categorization can be done for Attack and Cost predictions.
  - [ ] continue verifying the datasets by comparing tabular vs. multimodal at medium quality
  - [ ] pokemon, hearthstone-all @lujiaying
  - [x] hearthstone-minion, hearthstone-spell @qyccc3
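
The binning and metric rules from the list above can be sketched as follows, assuming raw values are non-negative numbers. `bin_stat`, `pick_eval_metric`, and `summarize_labels` are hypothetical names, and the returned dict is only an illustration of what info.txt could record, not its actual format.

```python
from collections import Counter


def bin_stat(value):
    """Map a numeric stat to '0'..'10', '11~20', or '>20'."""
    v = int(float(value))  # raw values may arrive as floats such as 4.0
    if v < 0:
        raise ValueError(f"unexpected negative value: {value!r}")
    if v <= 10:
        return str(v)
    if v <= 20:
        return "11~20"
    return ">20"


def pick_eval_metric(labels):
    """auc for binary tasks, log_loss for multiclass tasks."""
    n_classes = len(set(labels))
    if n_classes < 2:
        raise ValueError("need at least two classes")
    return "auc" if n_classes == 2 else "log_loss"


def summarize_labels(raw_values):
    """Bin raw values, then report label distribution and metric."""
    labels = [bin_stat(v) for v in raw_values]
    return {
        "label_distribution": dict(Counter(labels)),
        "eval_metric": pick_eval_metric(labels),
    }
```

Missing values (NaN) would raise an error here, so they need to be dropped or given their own label before binning.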
New game data to be included
- [ ] Magic would be next one
- [ ] Any other possible choices?
- [x] League of Legends skin
Set up Hopper Server Running scripts

lujiaying / MUG-Bench

Proposal #1
