New submissions for Fri, 14 Oct 22

Keyword: out of distribution detection

There is no result

Keyword: out-of-distribution detection

Large-Scale Open-Set Classification Protocols for ImageNet

Authors: Jesus Andres Palechor Anacona, Annesha Bhoumik, Manuel Günther
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2210.06789
Pdf link: https://arxiv.org/pdf/2210.06789
Abstract Open-Set Classification (OSC) intends to adapt closed-set classification models to real-world scenarios, where the classifier must correctly label samples of known classes while rejecting previously unseen unknown samples. Only recently has research started to investigate algorithms that are able to handle these unknown samples correctly. Some of these approaches address OSC by including negative samples in the training set that a classifier learns to reject, expecting that these data increase the robustness of the classifier on unknown classes. Most of these approaches are evaluated on small-scale and low-resolution image datasets like MNIST, SVHN or CIFAR, which makes it difficult to assess their applicability to the real world, and to compare them with each other. We propose three open-set protocols that provide rich datasets of natural images with different levels of similarity between known and unknown classes. The protocols consist of subsets of ImageNet classes selected to provide training and testing data closer to real-world scenarios. Additionally, we propose a new validation metric that can be employed to assess whether the training of deep learning models addresses both the classification of known samples and the rejection of unknown samples. We use the protocols to compare the performance of two baseline open-set algorithms to the standard SoftMax baseline and find that the algorithms work well on negative samples that have been seen during training, and partially on out-of-distribution detection tasks, but drop in performance in the presence of samples from previously unseen unknown classes.
Keyword: expected calibration error

There is no result

Keyword: overconfident

Can Calibration Improve Sample Prioritization?
Authors: Ganesh Tata, Gautham Krishna Gudur, Gopinath Chennupati, Mohammad Emtiyaz Khan
Subjects: Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2210.06592
Pdf link: https://arxiv.org/pdf/2210.06592
Abstract Calibration can reduce overconfident predictions of deep neural networks, but can calibration also accelerate training by selecting the right samples? In this paper, we show that it can. We study the effect of popular calibration techniques in selecting better subsets of samples during training (also called sample prioritization) and observe that calibration can improve the quality of subsets, reduce the number of examples per epoch (by at least 70%), and can thereby speed up the overall training process. We further study the effect of using calibrated pre-trained models coupled with calibration during training to guide sample prioritization, which again seems to improve the quality of samples selected.
Keyword: overconfidence

There is no result

Keyword: confidence

Improving the Reliability for Confidence Estimation
Authors: Haoxuan Qu, Yanchao Li, Lin Geng Foo, Jason Kuen, Jiuxiang Gu, Jun Liu
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2210.06776
Pdf link: https://arxiv.org/pdf/2210.06776
Abstract Confidence estimation, a task that aims to evaluate the trustworthiness of the model's prediction output during deployment, has received lots of research attention recently, due to its importance for the safe deployment of deep models. Previous works have outlined two important qualities that a reliable confidence estimation model should possess, i.e., the ability to perform well under label imbalance and the ability to handle various out-of-distribution data inputs. In this work, we propose a meta-learning framework that can simultaneously improve upon both qualities in a confidence estimation model. Specifically, we first construct virtual training and testing sets with some intentionally designed distribution differences between them. Our framework then uses the constructed sets to train the confidence estimation model through a virtual training and testing scheme leading it to learn knowledge that generalizes to diverse distributions. We show the effectiveness of our framework on both monocular depth estimation and image classification.
On the calibration of underrepresented classes in LiDAR-based semantic segmentation
Authors: Mariella Dreissig, Florian Piewak, Joschka Boedecker
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2210.06811
Pdf link: https://arxiv.org/pdf/2210.06811
Abstract The calibration of deep learning-based perception models plays a crucial role in their reliability. Our work focuses on a class-wise evaluation of several models' confidence performance for LiDAR-based semantic segmentation with the aim of providing insights into the calibration of underrepresented classes. Those classes often include VRUs and are thus of particular interest for safety reasons. With the help of a metric based on sparsification curves we compare the calibration abilities of three semantic segmentation models with different architectural concepts, each in a deterministic and a probabilistic version. By identifying and describing the dependency between the predictive performance of a class and the respective calibration quality we aim to facilitate the model selection and refinement for safety-critical applications.
Utilizing supervised models to infer consensus labels and their quality from data with multiple annotators
Authors: Hui Wen Goh, Ulyana Tkachenko, Jonas Mueller
Subjects: Machine Learning (cs.LG); Human-Computer Interaction (cs.HC); Machine Learning (stat.ML)
Arxiv link: https://arxiv.org/abs/2210.06812
Pdf link: https://arxiv.org/pdf/2210.06812
Abstract Real-world data for classification is often labeled by multiple annotators. For analyzing such data, we introduce CROWDLAB, a straightforward approach to estimate: (1) A consensus label for each example that aggregates the individual annotations (more accurately than aggregation via majority-vote or other algorithms used in crowdsourcing); (2) A confidence score for how likely each consensus label is correct (via well-calibrated estimates that account for the number of annotations for each example and their agreement, prediction-confidence from a trained classifier, and trustworthiness of each annotator vs. the classifier); (3) A rating for each annotator quantifying the overall correctness of their labels. While many algorithms have been proposed to estimate related quantities in crowdsourcing, these often rely on sophisticated generative models with iterative inference schemes, whereas CROWDLAB is based on simple weighted ensembling. Many algorithms also rely solely on annotator statistics, ignoring the features of the examples from which the annotations derive. CROWDLAB in contrast utilizes any classifier model trained on these features, which can generalize between examples with similar features. In evaluations on real-world multi-annotator image data, our proposed method provides superior estimates for (1)-(3) than many alternative algorithms.
An Open-World Lottery Ticket for Out-of-Domain Intent Classification
Authors: Yunhua Zhou, Peiju Liu, Yuxin Wang, Xipeng Qiu
Subjects: Computation and Language (cs.CL)
Arxiv link: https://arxiv.org/abs/2210.07071
Pdf link: https://arxiv.org/pdf/2210.07071
Abstract Most existing methods of Out-of-Domain (OOD) intent classification, which rely on extensive auxiliary OOD corpora or specific training paradigms, are underdeveloped in the underlying principle that the models should have differentiated confidence in In- and Out-of-domain intent. In this work, we demonstrate that calibrated subnetworks can be uncovered by pruning the (poorly calibrated) overparameterized model. Calibrated confidence provided by the subnetwork can better distinguish In- and Out-of-domain. Furthermore, we theoretically bring new insights into why temperature scaling can differentiate In- and Out-of-Domain intent and empirically extend the Lottery Ticket Hypothesis to the open-world setting. Extensive experiments on three real-world datasets demonstrate our approach can establish consistent improvements compared with a suite of competitive baselines.
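For readers unfamiliar with the temperature scaling mentioned in the abstract above, here is a minimal sketch of the general technique. It is not the paper's method: the function names, the example logits, and the temperature value are all hypothetical. It shows only that dividing logits by a temperature greater than 1 softens the softmax and lowers the maximum confidence, which is the score often thresholded to separate in-domain from out-of-domain inputs.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Return softmax probabilities of logits divided by a temperature.

    Hypothetical helper for illustration; temperature must be positive.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    # Subtract the maximum for numerical stability before exponentiating.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def max_confidence(logits, temperature=1.0):
    """Maximum softmax probability, a common confidence score."""
    return max(softmax_with_temperature(logits, temperature))


# Illustrative logits only (assumed values, not taken from the paper).
example_logits = [4.0, 1.0, 0.0]
sharp = max_confidence(example_logits, temperature=1.0)
soft = max_confidence(example_logits, temperature=2.0)
print(f"T=1: {sharp:.3f}  T=2: {soft:.3f}")
```

In practice the temperature is usually fitted on held-out data. The fixed value here is only for demonstration.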
FARE: Provably Fair Representation Learning
Authors: Nikola Jovanović, Mislav Balunović, Dimitar I. Dimitrov, Martin Vechev
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computers and Society (cs.CY)
Arxiv link: https://arxiv.org/abs/2210.07213
Pdf link: https://arxiv.org/pdf/2210.07213
Abstract Fair representation learning (FRL) is a popular class of methods aiming to produce fair classifiers via data preprocessing. However, recent work has shown that prior methods achieve worse accuracy-fairness tradeoffs than originally suggested by their results. This dictates the need for FRL methods that provide provable upper bounds on unfairness of any downstream classifier, a challenge yet unsolved. In this work we address this challenge and propose Fairness with Restricted Encoders (FARE), the first FRL method with provable fairness guarantees. Our key insight is that restricting the representation space of the encoder enables us to derive suitable fairness guarantees, while allowing empirical accuracy-fairness tradeoffs comparable to prior work. FARE instantiates this idea with a tree-based encoder, a choice motivated by inherent advantages of decision trees when applied in our setting. Crucially, we develop and apply a practical statistical procedure that computes a high-confidence upper bound on the unfairness of any downstream classifier. In our experimental evaluation on several datasets and settings we demonstrate that FARE produces tight upper bounds, often comparable with empirical results of prior methods, which establishes the practical value of our approach.
Keyword: scaling

Evaluated CMI Bounds for Meta Learning: Tightness and Expressiveness
Authors: Fredrik Hellström, Giuseppe Durisi
Subjects: Machine Learning (cs.LG); Information Theory (cs.IT); Machine Learning (stat.ML)
Arxiv link: https://arxiv.org/abs/2210.06511
Pdf link: https://arxiv.org/pdf/2210.06511
Abstract Recent work has established that the conditional mutual information (CMI) framework of Steinke and Zakynthinou (2020) is expressive enough to capture generalization guarantees in terms of algorithmic stability, VC dimension, and related complexity measures for conventional learning (Harutyunyan et al., 2021, Haghifam et al., 2021). Hence, it provides a unified method for establishing generalization bounds. In meta learning, there has so far been a divide between information-theoretic results and results from classical learning theory. In this work, we take a first step toward bridging this divide. Specifically, we present novel generalization bounds for meta learning in terms of the evaluated CMI (e-CMI). To demonstrate the expressiveness of the e-CMI framework, we apply our bounds to a representation learning setting, with $n$ samples from $\hat n$ tasks parameterized by functions of the form $f_i \circ h$. Here, each $f_i \in \mathcal F$ is a task-specific function, and $h \in \mathcal H$ is the shared representation. For this setup, we show that the e-CMI framework yields a bound that scales as $\sqrt{ \mathcal C(\mathcal H)/(n\hat n) + \mathcal C(\mathcal F)/n} $, where $\mathcal C(\cdot)$ denotes a complexity measure of the hypothesis class. This scaling behavior coincides with the one reported in Tripuraneni et al. (2020) using Gaussian complexity.
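To make the scaling in the abstract above concrete, the sketch below evaluates $\sqrt{\mathcal C(\mathcal H)/(n\hat n) + \mathcal C(\mathcal F)/n}$ for given values. The function name and the numeric complexity values are hypothetical, and constants and logarithmic factors are ignored, so this shows only the rate and not the bound itself. It illustrates that the shared-representation term shrinks as the number of tasks grows, while the task-specific term stays fixed.

```python
import math


def representation_learning_rate(c_h, c_f, n, n_hat):
    """Evaluate sqrt(C(H)/(n * n_hat) + C(F)/n).

    c_h: complexity of the shared representation class (assumed value)
    c_f: complexity of the task-specific class (assumed value)
    n: samples per task; n_hat: number of tasks
    """
    if n <= 0 or n_hat <= 0:
        raise ValueError("n and n_hat must be positive")
    return math.sqrt(c_h / (n * n_hat) + c_f / n)


# Illustrative values only: more tasks amortize the cost of learning h.
for n_hat in (1, 20, 200):
    rate = representation_learning_rate(c_h=100.0, c_f=10.0, n=50, n_hat=n_hat)
    print(f"n_hat={n_hat}: rate={rate:.4f}")
```

As the number of tasks grows, the rate approaches $\sqrt{\mathcal C(\mathcal F)/n}$, the cost of learning only the task-specific function.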
Parallel photonic accelerator for decision making using optical spatiotemporal chaos
Authors: Kensei Morijiri, Kento Takehana, Takatomo Mihana, Kazutaka Kanno, Makoto Naruse, Atsushi Uchida
Subjects: Emerging Technologies (cs.ET); Machine Learning (cs.LG); Optics (physics.optics)
Arxiv link: https://arxiv.org/abs/2210.06976
Pdf link: https://arxiv.org/pdf/2210.06976
Abstract Photonic accelerators have attracted increasing attention in artificial intelligence applications. The multi-armed bandit problem is a fundamental problem of decision making using reinforcement learning. However, the scalability of photonic decision making has not yet been demonstrated in experiments, owing to technical difficulties in physical realization. We propose a parallel photonic decision-making system for solving large-scale multi-armed bandit problems using optical spatiotemporal chaos. We solve a 512-armed bandit problem online, which is much larger than previous experiments by two orders of magnitude. The scaling property for correct decision making is examined as a function of the number of slot machines, evaluated as an exponent of 0.86. This exponent is smaller than that in previous work, indicating the superiority of the proposed parallel principle. This experimental demonstration facilitates photonic decision making to solve large-scale multi-armed bandit problems for future photonic accelerators.
An Open-World Lottery Ticket for Out-of-Domain Intent Classification
Authors: Yunhua Zhou, Peiju Liu, Yuxin Wang, Xipeng Qiu
Subjects: Computation and Language (cs.CL)
Arxiv link: https://arxiv.org/abs/2210.07071
Pdf link: https://arxiv.org/pdf/2210.07071
Abstract Most existing methods of Out-of-Domain (OOD) intent classification, which rely on extensive auxiliary OOD corpora or specific training paradigms, are underdeveloped in the underlying principle that the models should have differentiated confidence in In- and Out-of-domain intent. In this work, we demonstrate that calibrated subnetworks can be uncovered by pruning the (poorly calibrated) overparameterized model. Calibrated confidence provided by the subnetwork can better distinguish In- and Out-of-domain. Furthermore, we theoretically bring new insights into why temperature scaling can differentiate In- and Out-of-Domain intent and empirically extend the Lottery Ticket Hypothesis to the open-world setting. Extensive experiments on three real-world datasets demonstrate our approach can establish consistent improvements compared with a suite of competitive baselines.
Scalable Multi-robot Motion Planning for Congested Environments Using Topological Guidance
Authors: Courtney McBeth, James Motes, Diane Uwacu, Marco Morales, Nancy M. Amato
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA)
Arxiv link: https://arxiv.org/abs/2210.07141
Pdf link: https://arxiv.org/pdf/2210.07141
Abstract Multi-robot motion planning (MRMP) is the problem of finding collision-free paths for a set of robots in a continuous state space. The difficulty of MRMP increases with the number of robots due to the increased potential for collisions between robots. This problem is exacerbated in environments with narrow passages that robots must pass through, like warehouses. In single-robot settings, topology-guided motion planning methods have shown increased performance in these constricted environments. We adapt an existing topology-guided single-robot motion planning method to the multi-robot domain, introducing topological guidance for the composite space. We demonstrate our method's ability to efficiently plan paths in complex environments with many narrow passages, scaling to robot teams of size up to five times larger than existing methods in this class of problems. By leveraging knowledge of the topology of the environment, we also find higher quality solutions than other methods.
Exploring Long-Sequence Masked Autoencoders
Authors: Ronghang Hu, Shoubhik Debnath, Saining Xie, Xinlei Chen
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2210.07224
Pdf link: https://arxiv.org/pdf/2210.07224
Abstract Masked Autoencoding (MAE) has emerged as an effective approach for pre-training representations across multiple domains. In contrast to discrete tokens in natural languages, the input for image MAE is continuous and subject to additional specifications. We systematically study each input specification during the pre-training stage, and find sequence length is a key axis that further scales MAE. Our study leads to a long-sequence version of MAE with minimal changes to the original recipe, by just decoupling the mask size from the patch size. For object detection and semantic segmentation, our long-sequence MAE shows consistent gains across all the experimental setups without extra computation cost during the transfer. While long-sequence pre-training is discerned most beneficial for detection and segmentation, we also achieve strong results on ImageNet-1K classification by keeping a standard image size and only increasing the sequence length. We hope our findings can provide new insights and avenues for scaling in computer vision.
Keyword: calibration

Can Calibration Improve Sample Prioritization?
Authors: Ganesh Tata, Gautham Krishna Gudur, Gopinath Chennupati, Mohammad Emtiyaz Khan
Subjects: Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2210.06592
Pdf link: https://arxiv.org/pdf/2210.06592
Abstract Calibration can reduce overconfident predictions of deep neural networks, but can calibration also accelerate training by selecting the right samples? In this paper, we show that it can. We study the effect of popular calibration techniques in selecting better subsets of samples during training (also called sample prioritization) and observe that calibration can improve the quality of subsets, reduce the number of examples per epoch (by at least 70%), and can thereby speed up the overall training process. We further study the effect of using calibrated pre-trained models coupled with calibration during training to guide sample prioritization, which again seems to improve the quality of samples selected.
On the calibration of underrepresented classes in LiDAR-based semantic segmentation
Authors: Mariella Dreissig, Florian Piewak, Joschka Boedecker
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2210.06811
Pdf link: https://arxiv.org/pdf/2210.06811
Abstract The calibration of deep learning-based perception models plays a crucial role in their reliability. Our work focuses on a class-wise evaluation of several models' confidence performance for LiDAR-based semantic segmentation with the aim of providing insights into the calibration of underrepresented classes. Those classes often include VRUs and are thus of particular interest for safety reasons. With the help of a metric based on sparsification curves we compare the calibration abilities of three semantic segmentation models with different architectural concepts, each in a deterministic and a probabilistic version. By identifying and describing the dependency between the predictive performance of a class and the respective calibration quality we aim to facilitate the model selection and refinement for safety-critical applications.
ALIFE: Adaptive Logit Regularizer and Feature Replay for Incremental Semantic Segmentation
Authors: Youngmin Oh, Donghyeon Baek, Bumsub Ham
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2210.06816
Pdf link: https://arxiv.org/pdf/2210.06816
Abstract We address the problem of incremental semantic segmentation (ISS) recognizing novel object/stuff categories continually without forgetting previous ones that have been learned. The catastrophic forgetting problem is particularly severe in ISS, since pixel-level ground-truth labels are available only for the novel categories at training time. To address the problem, regularization-based methods exploit probability calibration techniques to learn semantic information from unlabeled pixels. While such techniques are effective, there is still a lack of theoretical understanding of them. Replay-based methods propose to memorize a small set of images for previous categories. They achieve state-of-the-art performance at the cost of large memory footprint. We propose in this paper a novel ISS method, dubbed ALIFE, that provides a better compromise between accuracy and efficiency. To this end, we first show an in-depth analysis on the calibration techniques to better understand the effects on ISS. Based on this, we then introduce an adaptive logit regularizer (ALI) that enables our model to better learn new categories, while retaining knowledge for previous ones. We also present a feature replay scheme that memorizes features, instead of images directly, in order to reduce memory requirements significantly. Since a feature extractor is changed continually, memorized features should also be updated at every incremental stage. To handle this, we introduce category-specific rotation matrices updating the features for each category separately. We demonstrate the effectiveness of our approach with extensive experiments on standard ISS benchmarks, and show that our method achieves a better trade-off in terms of accuracy and efficiency.
SageMix: Saliency-Guided Mixup for Point Clouds
Authors: Sanghyeok Lee, Minkyu Jeon, Injae Kim, Yunyang Xiong, Hyunwoo J. Kim
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2210.06944
Pdf link: https://arxiv.org/pdf/2210.06944
Abstract Data augmentation is key to improving the generalization ability of deep learning models. Mixup is a simple and widely-used data augmentation technique that has proven effective in alleviating the problems of overfitting and data scarcity. Also, recent studies of saliency-aware Mixup in the image domain show that preserving discriminative parts is beneficial to improving the generalization performance. However, these Mixup-based data augmentations are underexplored in 3D vision, especially in point clouds. In this paper, we propose SageMix, a saliency-guided Mixup for point clouds to preserve salient local structures. Specifically, we extract salient regions from two point clouds and smoothly combine them into one continuous shape. With a simple sequential sampling by re-weighted saliency scores, SageMix preserves the local structure of salient regions. Extensive experiments demonstrate that the proposed method consistently outperforms existing Mixup methods in various benchmark point cloud datasets. With PointNet++, our method achieves an accuracy gain of 2.6% and 4.0% over standard training in 3D Warehouse dataset (MN40) and ScanObjectNN, respectively. In addition to generalization performance, SageMix improves robustness and uncertainty calibration. Moreover, when adopting our method to various tasks including part segmentation and standard 2D image classification, our method achieves competitive performance.
Agent-Based Modelling for Urban Analytics: State of the Art and Challenges
Authors: Nick Malleson, Mark Birkin, Daniel Birks, Jiaqi Ge, Alison Heppenstall, Ed Manley, Josie McCulloch, Patricia Ternes
Subjects: Multiagent Systems (cs.MA)
Arxiv link: https://arxiv.org/abs/2210.06955
Pdf link: https://arxiv.org/pdf/2210.06955
Abstract Agent-based modelling (ABM) is a facet of wider Multi-Agent Systems (MAS) research that explores the collective behaviour of individual 'agents', and the implications that their behaviour and interactions have for wider systemic behaviour. The method has been shown to hold considerable value in exploring and understanding human societies, but is still largely confined to use in academia. This is particularly evident in the field of Urban Analytics; one that is characterised by the use of new forms of data in combination with computational approaches to gain insight into urban processes. In Urban Analytics, ABM is gaining popularity as a valuable method for understanding the low-level interactions that ultimately drive cities, but as yet is rarely used by stakeholders (planners, governments, etc.) to address real policy problems. This paper presents the state-of-the-art in the application of ABM at the interface of MAS and Urban Analytics by a group of ABM researchers who are affiliated with the Urban Analytics programme of the Alan Turing Institute in London (UK). It addresses issues around modelling behaviour, the use of new forms of data, the calibration of models under high uncertainty, real-time modelling, the use of AI techniques, large-scale models, and the implications for modelling policy. The discussion also contextualises current research in wider debates around Data Science, Artificial Intelligence, and MAS more broadly.
Two approaches to inpainting microstructure with deep convolutional generative adversarial networks
Authors: Isaac Squires, Samuel J. Cooper, Amir Dahari, Steve Kench
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
Arxiv link: https://arxiv.org/abs/2210.06997
Pdf link: https://arxiv.org/pdf/2210.06997
Abstract Imaging is critical to the characterisation of materials. However, even with careful sample preparation and microscope calibration, imaging techniques are often prone to defects and unwanted artefacts. This is particularly problematic for applications where the micrograph is to be used for simulation or feature analysis, as defects are likely to lead to inaccurate results. Microstructural inpainting is a method to alleviate this problem by replacing occluded regions with synthetic microstructure with matching boundaries. In this paper we introduce two methods that use generative adversarial networks to generate contiguous inpainted regions of arbitrary shape and size by learning the microstructural distribution from the unoccluded data. We find that one benefits from high speed and simplicity, whilst the other gives smoother boundaries at the inpainting border. We also outline the development of a graphical user interface that allows users to utilise these machine learning methods in a 'no-code' environment.
Towards Multi-Agent Reinforcement Learning driven Over-The-Counter Market Simulations
Authors: Nelson Vadori, Leo Ardon, Sumitra Ganesh, Thomas Spooner, Selim Amrouni, Jared Vann, Mengda Xu, Zeyu Zheng, Tucker Balch, Manuela Veloso
Subjects: Multiagent Systems (cs.MA); Artificial Intelligence (cs.AI); Computer Science and Game Theory (cs.GT); Computational Finance (q-fin.CP)
Arxiv link: https://arxiv.org/abs/2210.07184
Pdf link: https://arxiv.org/pdf/2210.07184
Abstract We study a game between liquidity provider and liquidity taker agents interacting in an over-the-counter market, for which the typical example is foreign exchange. We show how a suitable design of parameterized families of reward functions coupled with associated shared policy learning constitutes an efficient solution to this problem. Precisely, we show that our deep-reinforcement-learning-driven agents learn emergent behaviors relative to a wide spectrum of incentives encompassing profit-and-loss, optimal execution and market share, by playing against each other. In particular, we find that liquidity providers naturally learn to balance hedging and skewing as a function of their incentives, where the latter refers to setting their buy and sell prices asymmetrically as a function of their inventory. We further introduce a novel RL-based calibration algorithm which we found performed well at imposing constraints on the game equilibrium, both on toy and real market data.

ericbeyer / L-arxiv-interest-tracker

New submissions for Fri, 14 Oct 22 #660
