AIS-22 / UNI-AIS-BiometricSystems


Dataset from A. Uhl #3

Open m-langer opened 1 year ago

m-langer commented 1 year ago

@AIS-22/biometricsystems

Re: Course 911.100 23W 1SSt PS Biometric Systems

Dear Students,

Please find the data at https://www.cosy.sbg.ac.at/~uhl/Data_prepared.zip

So far, please only consider the original and spoofed data, but NOT the synthetic data, as the directory structure of the latter is not yet fully clarified.

best, AU

AleksandarRa commented 1 year ago

Dataset Description

The archive contains four datasets. Please ignore the “PROTECT” one, as its samples are of a different type (dorsal hand-vein samples instead of the finger-vein samples in the other three datasets).

Each folder contains subfolders for the genuine examples (bona fide), the spoofed ones and the synthetic ones:

where GANmethod denotes the GAN used to generate the data. Currently there is cycleGAN data for each dataset and, additionally, distanceGAN data for the PLUS dataset only (so if you want to compare generation methods, this is the dataset to use).

Synthetic subfolders

The synthetic subfolders contain further subfolders. The first level indicates the ID of the variant according to the table on the first page of Description.pdf (see also that file!). Below it there are usually 5 subfolders, one per fold used for generation. The _rs subfolders contain the resized images, and all_rs contains the images combined from all those runs.

Each one then again contains a reference subfolder, where the synthetic samples are stored.

An example path would be: PROTECT/spoofed_synthethic_cyclegan/010/3_rs/reference

Here the data is from the PROTECT dataset and consists of synthetic spoofs generated by cycleGAN, variant 010, fold 3, resized images.
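The example path can also be taken apart programmatically. The following is a minimal sketch, assuming every synthetic path has the five components shown in the example and that the GAN name is the last underscore-separated token of the second component. The function name and dictionary keys are illustrative and not part of the dataset.

```python
from pathlib import PurePosixPath


def parse_synthetic_path(path):
    """Split a path such as 'PROTECT/spoofed_synthethic_cyclegan/010/3_rs/reference'."""
    parts = PurePosixPath(path).parts
    if len(parts) < 5 or parts[4] != "reference":
        raise ValueError(f"unexpected layout: {path}")
    dataset, synth_dir, variant, fold_dir = parts[:4]
    # Assumption: the GAN name is the last underscore-separated token.
    gan_method = synth_dir.rsplit("_", 1)[-1]
    resized = fold_dir.endswith("_rs")
    fold = fold_dir[:-3] if resized else fold_dir
    return {
        "dataset": dataset,
        "gan_method": gan_method,
        "variant": variant,
        "fold": fold,
        "resized": resized,
    }


info = parse_synthetic_path("PROTECT/spoofed_synthethic_cyclegan/010/3_rs/reference")
```

Note that for the all_rs folder this sketch reports the fold as "all" with resized set to True.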

Which folder data should you use?

DO use all 5 subfolders according to the normal folds.
DO NOT use the 5 _rs subfolders.
DO NOT use the all_rs subfolder.
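A short sketch of this rule in Python, assuming the fold subfolders sit directly inside a variant folder and that only the resized folders end in "_rs" (which also covers all_rs). The function name is illustrative.

```python
from pathlib import Path


def usable_fold_dirs(variant_dir):
    """Return the fold subfolders of a variant folder, skipping every *_rs folder."""
    return sorted(
        p for p in Path(variant_dir).iterdir()
        if p.is_dir() and not p.name.endswith("_rs")
    )
```

The synthetic samples of each returned fold would then be read from its reference subfolder.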

Which variant (ID) should you be using?

PLUS: 003 and 004 (different generation variants wrt. fold construction, separate experiments, do not mix)
SCUT: 007 and 008 (different generation variants wrt. fold construction, separate experiments, do not mix)
VERA: 009
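One way to keep the variants apart in code is to treat every (dataset, variant) pair as its own experiment. This is only a sketch; the constant and function names are illustrative.

```python
# Variant IDs per dataset, as listed above.
VARIANTS = {
    "PLUS": ["003", "004"],
    "SCUT": ["007", "008"],
    "VERA": ["009"],
}


def experiments():
    """Yield one (dataset, variant) pair per separate experiment."""
    for dataset, variant_ids in VARIANTS.items():
        for variant in variant_ids:
            yield dataset, variant


runs = list(experiments())
```

Iterating over the pairs one at a time makes it harder to mix samples from, say, PLUS 003 and PLUS 004 by accident.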

Filename description


In the following, you find a description of how the file names of the samples are composed for better understanding of subject IDs (if you need to separate subjects into training and testing subjects).

The filename format for the individual datasets and synthetic samples is as follows:

All filenames start with a consecutive number, separated by a "-" from the filename used in the original dataset's naming scheme, e.g. 001-PLUS-FV3-Laser_PALMAR_001_01_02_01.png, where "001" is the consecutive number and "PLUS-FV3-Laser_PALMAR_001_01_02_01.png" is the original filename, according to the format of the datasets:
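Separating the consecutive number from the original filename could look like this. It is a minimal sketch, assuming the prefix is purely numeric and ends at the first "-"; the function name is illustrative.

```python
def split_prefix(filename):
    """Return (consecutive_number, original_filename)."""
    number, sep, original = filename.partition("-")
    if not sep or not number.isdigit():
        raise ValueError(f"no numeric prefix in {filename!r}")
    return number, original


number, original = split_prefix("001-PLUS-FV3-Laser_PALMAR_001_01_02_01.png")
```

Splitting at the first "-" only matters because original names such as PLUS-FV3-Laser contain further dashes.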

PLUSVein-FV3

https://wavelab.at/sources/PLUSVein-FV3/

The filenames are encoded using the following structure: [scanner name]_[DORSAL/PALMAR]_[session ID]_[user ID]_[finger ID]_[image ID].png. An example filename is: PLUS-FV3-Laser_PALMAR_001_01_02_01.png
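A regular expression can pull the fields out of such a name. The sketch below assumes that the fields are separated by underscores, as in the example filename, and that the consecutive-number prefix may or may not be present. The group names are illustrative.

```python
import re

PLUS_PATTERN = re.compile(
    r"^(?:(?P<number>\d+)-)?"        # optional consecutive-number prefix
    r"(?P<scanner>.+?)_"             # scanner name, e.g. PLUS-FV3-Laser
    r"(?P<view>DORSAL|PALMAR)_"
    r"(?P<session>\d+)_(?P<user>\d+)_(?P<finger>\d+)_(?P<image>\d+)\.png$"
)


def parse_plus(filename):
    """Return the filename fields as a dict of strings."""
    match = PLUS_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"not a PLUSVein-FV3 filename: {filename!r}")
    return match.groupdict()


fields = parse_plus("001-PLUS-FV3-Laser_PALMAR_001_01_02_01.png")
```

For a subject-wise split, the "user" field would be the grouping key.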

PROTECT

The filenames are encoded using the following structure: c[session ID]_b[sensor ID]_s_u[user ID]_[finger ID]-[image ID].png

Example filename: c1_bF_s_u000_02-000.png
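For completeness, the PROTECT example name can be parsed the same way. The pattern below is inferred from the single example c1_bF_s_u000_02-000.png. Treating "_s_" as a fixed part of the name is an assumption, and the group names are illustrative.

```python
import re

PROTECT_PATTERN = re.compile(
    r"^(?:(?P<number>\d+)-)?"        # optional consecutive-number prefix
    r"c(?P<session>[^_]+)_"
    r"b(?P<sensor>[^_]+)_"
    r"s_u(?P<user>\d+)_"             # "_s_" assumed to be a literal part of the name
    r"(?P<finger>\d+)-(?P<image>\d+)\.png$"
)


def parse_protect(filename):
    """Return the filename fields as a dict of strings."""
    match = PROTECT_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"not a PROTECT filename: {filename!r}")
    return match.groupdict()


fields = parse_protect("c1_bF_s_u000_02-000.png")
```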

IDIAP Vera FV Spoofing Attack

https://www.idiap.ch/en/dataset/vera-fingervein

Samples are stored with the following filename convention: full/bf/004-F/004_L2. The fields can be interpreted as [size]/[source]/[subject ID]-[gender]/[subject ID]_[side][trial]. [size] is one of two options, full or cropped. The images in the full directory contain the full image produced by the sensor. The images in the cropped directory are pre-cropped regions of interest (RoI), which can be used directly for feature extraction without region-of-interest detection. We provide both verification and presentation-attack detection protocols for the full and cropped versions of the images.

The [source] field may be one of bf (bona fide) or pa (presentation attack) and represents the genuineness of the image. Naturally, biometric recognition uses only images of the bf folder for all protocols as indicated below. [subject ID] is a 3-digit number that stands for the subject's unique identifier. [gender] can be either M (male) or F (female). [side] corresponds to the index finger side and can be either "R" or "L" ("Right" or "Left"). [trial] corresponds to either the first (1) or the second (2) time the subject interacted with the device.
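The VERA convention can be parsed along the same lines. This is a sketch based only on the example full/bf/004-F/004_L2. The group names (size, source, subject, gender, side, trial) are labels chosen here, and an underscore between side and trial is accepted as optional because the text above does not settle it.

```python
import re

VERA_PATTERN = re.compile(
    r"^(?P<size>full|cropped)/"
    r"(?P<source>bf|pa)/"
    r"(?P<subject>\d{3})-(?P<gender>[MF])/"
    r"(?P=subject)_"                 # subject ID must repeat in the file name
    r"(?P<side>[LR])_?(?P<trial>[12])"
    r"(?:\.\w+)?$"                   # optional file extension
)


def parse_vera(path):
    """Return the path fields as a dict of strings."""
    match = VERA_PATTERN.match(path)
    if match is None:
        raise ValueError(f"not a VERA sample path: {path!r}")
    return match.groupdict()


fields = parse_vera("full/bf/004-F/004_L2")
```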

SCUT FVD Spoofed

https://github.com/BIP-Lab/SCUT-SFVD

Images are labeled as follows: ID_finger_session_shot_light.bmp, where “ID” stands for the client's ID; “finger” ranges from 1 to 6, standing for the index, middle and ring finger of the right and left hand respectively; “session” stands for the session number, which can be "0" or "1"; “shot” stands for the shot number, ranging from 0 to 5; and “light” stands for the level of light intensity, which can be an integer between 1 and 6.

Images from the same client are regrouped into a single folder labeled.

NOTE: When processing these images, you only need to consider the first two labels, i.e. “ID” and “finger”.
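To separate subjects into training and testing subjects, the “ID” label can serve as the grouping key. The following is a rough sketch for SCUT-style names. The helper names, the 50/50 split, and the example filenames are made up for illustration, and the consecutive-number prefix is assumed to be optional.

```python
import random
from collections import defaultdict


def scut_id_and_finger(filename):
    """Return (client ID, finger) from '[NNN-]ID_finger_session_shot_light.bmp'."""
    head, sep, rest = filename.partition("-")
    name = rest if sep and head.isdigit() else filename  # drop the numeric prefix
    client_id, finger = name.split("_")[:2]
    return client_id, finger


def split_by_subject(filenames, test_fraction=0.2, seed=0):
    """Split so that no client ID appears in both the training and the test set."""
    by_subject = defaultdict(list)
    for filename in filenames:
        by_subject[scut_id_and_finger(filename)[0]].append(filename)
    subjects = sorted(by_subject)
    random.Random(seed).shuffle(subjects)
    n_test = max(1, round(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])
    train = [f for s in subjects if s not in test_subjects for f in by_subject[s]]
    test = [f for s in subjects if s in test_subjects for f in by_subject[s]]
    return train, test


# Hypothetical filenames, for illustration only.
files = [
    "001-1_1_0_0_1.bmp",
    "002-1_2_0_0_1.bmp",
    "003-2_1_0_0_1.bmp",
    "004-3_1_0_0_1.bmp",
    "005-4_1_0_0_1.bmp",
]
train, test = split_by_subject(files, test_fraction=0.5, seed=0)
```

Grouping by client ID rather than by (ID, finger) keeps all fingers of one person on the same side of the split.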

AleksandarRa commented 1 year ago

Description.pdf

Databases:

All datasets used in the experiments are derived from the databases listed below. The images of those databases are saved in the respective folders.
• PLUS
• SCUT
• VERA
• PROTECT

Datasets:

The figure below gives an overview of all used datasets. The datasets are stored in the “Datasets” folder, grouped by their Databases, and named by the ID listed in the figure.

image

Datasets structure:

image

Results:

The “Results” folder is grouped by network, then by dataset and experiment number. An overview of the executed experiments is shown in the figure below.

image

There are two subfolders for each executed experiment: “evaluation” and “output”. The first holds the plots and tables from each single run of the 5 folds and from the all-run. “OverviewImage.png” shows four example outputs from the network in comparison to the genuine and manually spoofed images. “OverviewTableCompare.tex” holds the important values from the matching scores from a single run of the 5 folds. “OverviewTableCompareAll.tex” holds the values from all runs together. The “output” folder holds all images created by the network from all runs of the 5 folds. There is also an all_rs subfolder containing images combined from all those runs. The “_rs” folders contain the resized images from the single runs.

AleksandarRa commented 11 months ago

The new dataset has no VERA. Which variant (ID) should we use for PROTECT and IDIAP?