I am currently in Section 4.2, which covers the loss function. One of the paragraphs was relatively hard to understand. I used the Consensus app inside ChatGPT to get clarification. The answer is well grounded, with citations to the works that underpin the concepts I needed to understand.
An excerpt:
The concept of fairness-aware contrastive loss function in facial recognition, as described in your query, involves several technical aspects: larger gradients, similarity to margin penalty, balancing unfairness, and achieving consistent compactness across races.
More details at the link mentioned above.
Reference 22, Fair contrastive learning for facial attribute classification, exploits the interrelation between anchor and sample to design a sensitive attribute removing loss function.
Reference 42, Consistent instance false positive improves fairness in face recognition, uses instance FPR in loss function to constrain bias.
Both 22 and 42 proposed a fairness-aware loss function.
Reference 39, Fairness-aware adversarial perturbation towards bias mitigation for deployed deep models, implemented post-processing data perturbation that can hide the information of protected attributes without changing the deployed models' parameters and structures.
Reference 11, Mitigating face recognition bias via group adaptive classifier, proposed to include demographic-adaptive layers that make the model generate face representations for every demographic group.
It is important to make the model focus on the most critical regions. Siamese nets and attention schemes are popular methods in kinship verification because they focus on similar facial traits.
They reverse the gradient of the race classification branch to remove the racial information in the feature vector.
They design a fairness-aware contrastive loss function that can mitigate pairwise bias and significantly decrease the standard deviation in four races.
The first work to propose to mitigate bias and achieve SOTA accuracy simultaneously for kinship verification.
A fairness-aware contrastive loss function that mitigates the pairwise bias and balances the degree of compactness of every race, which improves racial fairness.
A large kinship dataset with racial labels from several public kinship datasets.
Deep fusion siamese network for automatic kinship verification (2020)
- Proposed a feature fusion method that takes discriminative features from the backbone network and fuses them to determine whether two face images share a kinship relationship.
Supervised Contrastive Learning for Facial Kinship Recognition (2021)
- Adopted ArcFace as the backbone model pre-trained on MS-Celeb-1M to obtain more representative features. Moreover, they used a supervised contrastive loss function to contrast samples against each other and a hyperparameter ($\tau$, temperature) to focus on hard samples, thus enhancing the ability to distinguish the kinship relation.
Kinship representation learning with face componential relation (2023)
- Successfully enhanced the accuracy of the kinship verification task by leveraging an attention mechanism. They combined the attention mechanism with the backbone to focus on the most discriminative parts of the facial image (e.g., the facial features: eyes, nose, mouth). They also proposed a new loss function that combines the contrastive loss with the attention map produced by the attention mechanism.
Innovations to tackle racial bias in AI systems happen in the development of new algorithms (how so?), new model architectures, or novel loss functions.
Adversarial learning with gradient reversal layer to learn fair features.
Gradient reversal against discrimination (2018)
- They devised fair features using an adversarial learning technique. This method involved the incorporation of a gradient reversal layer, effectively flipping the gradient of the classification head for sensitive attributes. This strategic move encouraged the model’s encoder to generate features devoid of sensitive information, thus reducing potential bias.
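For intuition, a minimal PyTorch sketch of such a gradient reversal layer (the identifiers are mine, not taken from the cited paper):

```python
import torch
from torch.autograd import Function


class GradReverse(Function):
    """Identity in the forward pass; flips (and scales) the gradient in the backward pass."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # The encoder receives -lambd * (gradient of the sensitive-attribute head),
        # which pushes it to produce features without that information.
        return -ctx.lambd * grad_output, None


def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)


# Usage sketch: race_logits = race_head(grad_reverse(features, lambd=1.0))
```

The sensitive-attribute classification loss is then computed on `race_logits` as usual; only the gradient flowing back into the encoder is reversed.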
Adversarial learning to attain discriminative features while disentangling features into four crucial attributes
Jointly debiasing face recognition and demographic attribute estimation (2020)
- They leveraged adversarial learning to attain discriminative feature representation, simultaneously disentangling features into four distinct attributes. This process of disentanglement aimed to preserve crucial attributes while discarding unfair ones. By carefully manipulating the feature space, the model could successfully eliminate biases linked with sensitive attributes.
Adversarial learning to conceal information associated with fairness-related attributes (e.g. race, skin color, gender, age, etc.) by input perturbation
Fairness-aware adversarial perturbation towards bias mitigation for deployed deep models (2022)
- They introduced an approach with the aim of mitigating bias in deployed models. Unlike previous state-of-the-art methods that focused on altering the deployed models, they took a different route by concentrating on perturbing inputs. They employed a discriminator trained to differentiate fairness-related attributes from latent representations within the deployed models. Simultaneously, an adversarially trained generator worked to deceive the discriminator, ultimately generating perturbations that can conceal the information associated with protected attributes.
Adversarial learning with adaptive layers to enhance representation robustness for different demographic groups
Mitigating face recognition bias via group adaptive classifier
- In addition to the use of adversarial learning, they proposed the incorporation of adaptive layers within the model structure. The introduced adaptive layer aimed to enhance representation robustness for different demographic groups. An automation module was integrated to determine the optimal usage of adaptive layers in various model layers, dynamically adjusting the network’s behavior to cater to the unique requirements of different groups.
Softmax loss function with instance False Positive Rate
Consistent instance false positive improves fairness in face recognition
- Another approach involved the modification of the softmax loss function with a novel penalty term to mitigate bias while concurrently improving accuracy. They achieved this by utilizing instance False Positive Rate as a surrogate for demographic False Positive Rate, eliminating the need for explicit demographic group labels.
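A rough, illustrative sketch of the instance-FPR idea (my interpretation, not the paper's exact formulation): estimate a global decision threshold from the desired overall FPR, softly count each instance's false positives above it, and penalize deviation from the batch-level rate.

```python
import torch


def instance_fpr_penalty(neg_sims, target_fpr=1e-3, temp=0.01):
    """neg_sims: (B, K) cosine similarities between each instance and its K negatives.

    Returns a scalar penalty that grows when some instances produce many more
    false positives than the batch average (a proxy for demographic FPR gaps).
    """
    # Threshold such that roughly `target_fpr` of all negative scores fall above it.
    thr = torch.quantile(neg_sims.detach().flatten(), 1.0 - target_fpr)
    # Soft indicator of "negative scored above the threshold", i.e., a false positive.
    soft_fp = torch.sigmoid((neg_sims - thr) / temp)
    inst_fpr = soft_fp.mean(dim=1)   # per-instance false positive rate
    return (inst_fpr - inst_fpr.mean()).abs().mean()
```

In the cited work this kind of term is added as a penalty to a margin-based softmax loss; the snippet above only sketches the surrogate idea.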
Could this strategy be used for other biases, like gender and age, where we do not necessarily have ground-truth labels?
A novel loss function combining CosFace with bias difference to minimize identity bias
MixFairFace: Towards Ultimate Fairness via MixFair Adapter in Face Recognition (2022)
- They shifted their focus from demographic group bias to identity bias. They combined CosFace [36] with a bias difference term to create a novel loss function. Their belief was that by targeting identity bias they could address the problem of skewed outcomes and treat all individuals impartially, striving for a comprehensive fairness that does not divide people by race. This approach minimizes identity bias without requiring sensitive attribute labels, thereby effectively enhancing fairness between demographic groups.
The authors, then, propose to integrate fairness and accuracy, aiming to improve both aspects. They do so by using adversarial learning with a fairness-aware loss function in a multi-task model structure with an attention mechanism.
KinRace is composed of six datasets: CornellKin, UBKinFace, KinFaceW-I, KinFaceW-II, Family101, and FIW.
They use only the main kinship types: FS, FD, MS, MD.
They limit the total number of images for each identity to at most 30.
They label each sample manually with one of four races: African, Asian, Caucasian, and Indian.
To mitigate the other-race effect, they use three annotators of different races. The ground truth is determined by majority vote; if there is no majority, the identity is not used.
KinRace's racial distribution follows BUPT-Globalface, which is approximately the same as the real-world distribution.
Mixed-race positive pairs are removed.
They created KinRace because of the absence of race labels in kinship datasets. Also, they use four races to enable studies on the same benchmark.
They manage to reduce race bias, but identity bias still exists, even though they limit each identity to 30 images.
Data quality alone doesn't significantly improve results, but since it is crucial to face verification, the authors plan to explore it in future work.
Certain facial features used to determine kinship might be closely linked with racial characteristics. When these racial characteristics are deliberately obscured to avoid bias, the model may lose some of the information that was helping it accurately verify kinship.
The authors build on top of two previous works: the supervised contrastive loss from Supervised Contrastive Learning for Facial Kinship Recognition (2021) and a loss with a debias term from MixFairFace: Towards Ultimate Fairness via MixFair Adapter in Face Recognition (2022).
They propose the fairness-aware contrastive loss function: $L_{\text{fairness}} = -\log \frac{e^{\left(\cos(x_i,y_i) - b_i\right)/\tau}}{\sum_{j\neq i}^N \left[e^{\cos(x_i,x_j)/\tau} + e^{\cos(x_i,y_j)/\tau}\right] + e^{\left(\cos(x_i,y_i)-b_i\right)/\tau}}$, where $b_i$ is the batch average of the $\epsilon$ defined by $\cos(M(f_m), M(f_i))^2 - \cos(M(f_m), M(f_j))^2 = \epsilon$. They use the cross-entropy loss to train the race classifier: $L_{\text{race}} = - \sum_{i=1}^n t_i \log(p_i)$. The total loss is $L_{\text{total}} = L_{\text{fairness}} + L_{\text{race}}$.
$(x_i, y_i)$ are positive pairs, while $(x_i, x_j)$ and $(x_i, y_j)_{(j \neq i)}$ are negative pairs. See Zhang et al. (2021), Supervised Contrastive Learning for Facial Kinship Recognition.
$f_m = \frac{1}{2}(f_i + f_j)$ and $M(\cdot)$ is the debias layer. If $\epsilon > 0$, then $i$ has a larger (face recognition) bias than $j$. See MixFairFace: Towards Ultimate Fairness via MixFair Adapter in Face Recognition (2022).
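A minimal PyTorch sketch of $L_{\text{fairness}}$ as I read the formula above (names and shapes are my own assumptions, not the authors' code); $b$ would come from the debias layer $M$ as described:

```python
import torch
import torch.nn.functional as F


def fair_contrastive_loss(x, y, b, tau=0.08):
    """x, y: (N, D) embeddings of the two images in each positive pair (x_i, y_i).
    b: per-pair debias term b_i (or the scalar batch average of epsilon), per the note above.
    """
    x, y = F.normalize(x, dim=1), F.normalize(y, dim=1)
    sim_xx = x @ x.t() / tau                           # cos(x_i, x_j) / tau
    sim_xy = x @ y.t() / tau                           # cos(x_i, y_j) / tau
    eye = torch.eye(x.size(0), dtype=torch.bool, device=x.device)

    pos = sim_xy.diagonal() - b / tau                  # (cos(x_i, y_i) - b_i) / tau
    neg_xx = sim_xx.masked_fill(eye, float("-inf"))    # drop the j == i terms
    neg_xy = sim_xy.masked_fill(eye, float("-inf"))

    # -log( e^pos / (sum_{j != i} [e^{xx} + e^{xy}] + e^pos) ), computed stably.
    logits = torch.cat([pos.unsqueeze(1), neg_xx, neg_xy], dim=1)
    return (torch.logsumexp(logits, dim=1) - pos).mean()
```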
The loss function proposed by MixFairFace: Towards Ultimate Fairness via MixFair Adapter in Face Recognition (2022) aims to reduce identity bias. How is race included in $L_{\text{fairness}}$, then?
In the original paper, the authors defined identity bias as the performance variance between "each identity". How should we understand this in the context of kinship verification? As the performance variance between "each kinship pair"? The debias layer receives both feature vectors; together they represent a positive or negative pair.
Further in the paper, section 4.2, they explain identity biases as those "introduced by their races, genders, or other individual differences".
They build upon Understanding the behaviour of contrastive loss (2021) to validate the idea that a positive bias $b_i$ means a stronger learning signal ($P_{i,j}$ is larger) for positive and negative pairs.
$\frac{\partial L(x_i)}{\partial \cos(x_i, y_i)} = -\frac{1}{\tau} \sum_{k \neq i} P_{i,k}, \quad \frac{\partial L(x_i)}{\partial \cos(x_i, x_j)} = \frac{1}{\tau} P_{i,j}$: the gradients with respect to the positive sample and to each negative sample, respectively. $P_{i,j}$ is the probability of $x_i$ and $x_j$ being recognized as a positive pair.
They show the change only in $P_{i,j}$, but I think they also add $b_i$ to $P_{i,k}$. Otherwise, it doesn't make sense, because the former relates to the different negative samples.
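For reference, my own short derivation under the loss form above, which also shows where $b_i$ enters both gradients. Write $a_i = (\cos(x_i, y_i) - b_i)/\tau$ and $c_{i,k} = \cos(x_i, \cdot_k)/\tau$ for the negatives $k \neq i$, so that

$$L(x_i) = -a_i + \log\Big(e^{a_i} + \sum_{k \neq i} e^{c_{i,k}}\Big), \qquad P_{i,k} = \frac{e^{c_{i,k}}}{e^{a_i} + \sum_{l \neq i} e^{c_{i,l}}}.$$

Differentiating with respect to the positive similarity and to one negative similarity gives

$$\frac{\partial L(x_i)}{\partial \cos(x_i, y_i)} = \frac{1}{\tau}\left(-1 + \frac{e^{a_i}}{e^{a_i} + \sum_{l \neq i} e^{c_{i,l}}}\right) = -\frac{1}{\tau}\sum_{k \neq i} P_{i,k}, \qquad \frac{\partial L(x_i)}{\partial \cos(x_i, x_j)} = \frac{1}{\tau} P_{i,j}.$$

A larger $b_i$ lowers $a_i$, which raises every $P_{i,k}$, so both gradient magnitudes grow; that is, $b_i$ indeed affects the whole sum $\sum_{k \neq i} P_{i,k}$, not only an individual $P_{i,j}$.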
- This work employs two methods for improving fairness: adversarial learning and a fair loss function. We use a race classifier in adversarial learning to remove racial information from the feature vectors, which decreases the standard deviation.
ArcFace model; feature maps in $\mathbb{R}^{7\times7\times512}$ and feature vectors in $\mathbb{R}^{512}$.
$\tau = 0.08$ (follows Supervised Contrastive Learning for Facial Kinship Recognition (2021)).
SGD with momentum = 0.9 and weight decay = 1e-4.
10 epochs with 60000 steps; batch size = 25.
Baseline is the SOTA2021 method (Supervised Contrastive Learning for Facial Kinship Recognition (2021)).
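A tiny PyTorch sketch of this training setup (the module and learning rate are placeholders; only the hyperparameters listed above come from the notes):

```python
import torch
from torch import nn

TAU = 0.08        # contrastive temperature
BATCH_SIZE = 25   # pairs per batch
EPOCHS = 10       # roughly 60000 steps in total

model = nn.Linear(512, 4)  # stand-in for the ArcFace backbone + heads
optimizer = torch.optim.SGD(model.parameters(),
                            lr=0.1,  # placeholder value, not reported in these notes
                            momentum=0.9, weight_decay=1e-4)
```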
- In the experiments below, "adversarial" means we reverse the gradient of the race classification branch, as indicated by the red line in Figure 2. "Multi-task" means we do not reverse the gradient of the race classification branch; instead, we just train the model normally, as indicated by the green line in Figure 2.
Both fairness strategies (gradient reversal and the debias layer) mitigate bias (reduce the standard deviation) but also harm accuracy. By merging both strategies, they remarkably reduce the standard deviation while boosting accuracy.
- Firstly, the feature vector excludes racial information, which benefits from adversarial learning. Secondly, the debias layer becomes more robust because it can generate the debias term based on the most essential facial features while racial traits are removed.
Their overall strategy enhances fairness while maintaining accuracy.
How can the debias layer generate a debias term if the feature vector has no racial information?
They compare their method with three other works (Achieving Better Kinship Recognition Through Better Baseline, Deep fusion siamese network for automatic kinship verification (2020), Supervised Contrastive Learning for Facial Kinship Recognition (2021)) that performed well in the RFIW challenge.
Evaluate the generalization of the method on other datasets: UB KinFace and FIW.
They note that their results are competitive with Kinship representation learning with face componential relation (2023) and claim better results because of the use of the ArcFace backbone.
The paper titled "KFC: Kinship Verification with Fair Contrastive Loss and Multi-Task Learning" by Jia Luo Peng, Keng Wei Chang, and Shang-Hong Lai, addresses the challenge of kinship verification in the presence of biases associated with gender, ethnicity, and age due to the lack of large-scale, diverse datasets. The authors propose a comprehensive solution involving a multi-task learning architecture with an attention module and introduce a fairness-aware contrastive loss function that incorporates a debiasing term with adversarial learning. The approach is evaluated on a newly constructed dataset named KinRace, designed to be robust against race-related biases.
The model's architecture is adept at counteracting biases while improving kinship verification performance. By combining gradient reversal and a fairness-aware contrastive loss function, the model can mitigate racial biases effectively without compromising the accuracy.
The attention module in the multi-task architecture concentrates on the relevant facial features, allowing for discrimination of kin relationships without racial information influencing the decision-making process.
The novel loss function proposed rightly extends previous work on supervised contrastive loss and debiasing terms, addressing both the accuracy and fairness in kinship verification, which had previously been handled separately.
The dataset KinRace has been carefully curated to represent different races evenly and excludes mixed-race pairs to ensure clarity in racial categories. This attention to detail underlines the importance of dataset quality in machine learning tasks, especially those sensitive to biases.
In relation to the KinRace dataset, further research could focus on including mixed-racial pairs and how the model would perform in kinship verification in more complex, diverse familial backgrounds.
Investigate how the debias layer functions when racial information has been extracted. Can the model still effectively generate debias terms based on non-racially discriminative features?
Re-evaluating the SOTA approaches on the KinRace dataset opens a question about the adaptability of models to new datasets with varied distributions. Future research could investigate optimal re-implementation guidelines for fair assessment when applying existing methods to new datasets.
It may be worth exploring the application of the proposed fairness-aware loss function and adversarial learning techniques to other domains where fairness is critical, such as credit scoring or predictive policing, to see if similar reductions in bias can be achieved.
Since the authors highlight the potential limitations of their method when employed on small datasets, it would be valuable to explore strategies that can enhance the performance and fairness in limited-data scenarios.
The reduction of race bias in models poses the question of whether similar mechanisms could be designed to mitigate other forms of biases, like age or gender biases, in datasets where corresponding labels might be unavailable or unreliable.
This research presents pivotal advancements in kinship verification accuracy and racial fairness, paving the way for more inclusive and ethically conscious AI models in facial recognition technologies.
The reduction of race bias in models poses the question of whether similar mechanisms could be designed to mitigate other forms of biases, like age or gender biases, in datasets where corresponding labels might be unavailable or unreliable.
This question, as well as the content above, was generated by GPT-4 using my notes. It is very pertinent to what we are already doing.
This paper was quite complex. I spent around 12 hours studying its content and, at times, the concepts or papers it cites. I need to be more efficient with the remaining ones.
To a large extent, this work was a combination of the works below.
I think our next steps should be taken with this question in mind. In that sense, what works exist that focus on removing gender and age biases? #41 was one; there is also #34.
Contrastive loss inspired by Supervised Contrastive Learning for Facial Kinship Recognition (2021)
- I think they build mostly upon this work -- network structure and hyperparameters.
Confirmed. Their code was adapted from #26. They also cite it explicitly.
I found it while looking for code for #49.