Different literature for testing and validation of DNN #10

Open deebuls opened 2 years ago

deebuls commented 2 years ago

http://www.cs.toronto.edu/~chechik/courses19/csc2125/week5/DeepTest.pdf

https://ieeexplore.ieee.org/document/8809533 - Deep Validation: Toward Detecting Real-World Corner Cases for Deep Neural Networks

https://arxiv.org/pdf/2101.02494.pdf - Corner Case Data Description and Detection

@rashidahamedmeeran write a paragraph on your understanding of corner-case dataset generation in the proposal or a document, then close this issue.

deebuls commented 2 years ago

https://assured-autonomy.org/tools/verifai

Seems to be very relevant. It has a list of the papers they are using.

Fremont, D. J., Kim, E., Pant, Y. V., Seshia, S. A., Acharya, A., Bruso, X., et al. (2020). Formal Scenario-Based Testing of Autonomous Vehicles: From Simulation to the Real World. arXiv. Retrieved from https://arxiv.org/abs/2003.07739

@rashidahamedmeeran write a paragraph in the proposal or a document on what you understand from these papers about DNN verification, and about using simulation for verification. Then we can discuss this topic.

deebuls commented 2 years ago

https://arxiv.org/pdf/2202.12139.pdf

Paper from https://icst2022.vrain.upv.es/program/program-icst-2022/

It's an IEEE conference on software testing (the venue you should plan to submit the R&D to as a paper). The paper is on "Testing Deep Learning Models: A First Comparative Study of Multiple Testing Techniques".

The interesting thing is that they make no mention of simulation-based testing, only dataset-augmentation-based techniques. So there is a good extension of our work here. @rashidahamedmeeran
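
To make the contrast concrete, here is a minimal sketch (not from the paper) of what a dataset-augmentation-based metamorphic check typically looks like: apply a label-preserving transformation such as a brightness shift and measure whether the model's predictions stay consistent. The `model` callable, image shapes, and threshold are stand-ins.

```python
# Minimal sketch of an augmentation-based (metamorphic) test, assuming a trained
# classifier exposed as a callable returning class scores and images scaled to [0, 1].
import numpy as np

def brightness_shift(images: np.ndarray, delta: float) -> np.ndarray:
    """Label-preserving transformation: add a constant brightness offset."""
    return np.clip(images + delta, 0.0, 1.0)

def prediction_consistency(model, images: np.ndarray, delta: float = 0.1) -> float:
    """Fraction of inputs whose predicted class survives the transformation."""
    original = np.argmax(model(images), axis=1)
    transformed = np.argmax(model(brightness_shift(images, delta)), axis=1)
    return float(np.mean(original == transformed))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fake_images = rng.random((32, 28, 28, 1))
    # Stand-in "model": the first 10 pixel values act as class scores, so the sketch runs alone.
    fake_model = lambda x: x.reshape(len(x), -1)[:, :10]
    print("consistency:", prediction_consistency(fake_model, fake_images))
```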

deebuls commented 1 year ago

http://ceur-ws.org/Vol-2894/short4.pdf

Towards Accountability Driven Development for Machine Learning Systems

With rapid deployment of Machine Learning (ML) systems into diverse domains such as healthcare and autonomous driving, important questions regarding accountability in case of incidents resulting from ML errors remain largely unsolved. To improve accountability of ML systems, we introduce a framework called Accountability Driven Development (ADD). Our framework reuses Behaviour Driven Development (BDD) approach to describe testing scenarios and system behaviours in ML Systems’ development using natural language, guides and forces developers and intended users to actively record necessary accountability information in the design and implementation stages. In this paper, we illustrate how to transform accountability requirements to specific scenarios and provide syntax to describe them. The use of natural language allows non technical collaborators such as stakeholders and non ML domain experts deeply engaged in ML system development to provide more comprehensive evidence to support system’s accountability. This framework also attributes the responsibility to the whole project team including the intended users rather than putting all the accountability burden on ML engineers only. Moreover, this framework can be considered as a combination of both system test and acceptance test, thus making the development more efficient. We hope this work can attract more engineers to use our idea, which enables them to create more accountable ML systems.
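
As an illustration only (the paper defines its own scenario syntax, which is not reproduced here), a Given/When/Then accountability scenario could be tied to an executable check plus a recorded evidence trail, roughly as below; every name, metric, and threshold is a hypothetical stand-in.

```python
# Minimal sketch: map a natural-language accountability scenario onto an executable
# test and an evidence log. All models, datasets, and thresholds are stand-ins.
import json
import time

EVIDENCE_LOG = "accountability_log.jsonl"

def record_evidence(**fields):
    """Append who/what/when evidence for the scenario to a JSONL log."""
    fields["timestamp"] = time.time()
    with open(EVIDENCE_LOG, "a") as f:
        f.write(json.dumps(fields) + "\n")

def critical_recall(model, validation_set):
    """Stand-in metric; a real system would compute recall on the held-out data."""
    hits = sum(1 for x, y in validation_set if model(x) == y and y == "critical")
    total = sum(1 for _, y in validation_set if y == "critical")
    return hits / max(total, 1)

def test_critical_recall_scenario():
    # Given a triage model and its validation set (stand-ins here)
    model = lambda x: "critical" if x > 0.5 else "routine"
    validation_set = [(0.9, "critical"), (0.7, "critical"), (0.2, "routine")]
    # When the model is evaluated on the held-out data
    recall = critical_recall(model, validation_set)
    # Then recall for the "critical" class must meet the agreed threshold
    assert recall >= 0.95
    # And the result is recorded with the responsible reviewer (accountability trail)
    record_evidence(metric="critical_recall", value=recall, reviewer="clinical-lead")

if __name__ == "__main__":
    test_critical_recall_scenario()
    print("scenario passed and evidence recorded")
```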

deebuls commented 1 year ago

Operationalizing Machine Learning: An Interview Study

Shreya Shankar, Rolando Garcia, Joseph M. Hellerstein, Aditya G. Parameswaran

We conducted semi-structured ethnographic interviews with 18 MLEs working across many applications, including chatbots, autonomous vehicles, and finance. Our interviews expose three variables that govern success for a production ML deployment: Velocity, Validation, and Versioning. We summarize common practices for successful ML experimentation, deployment, and sustaining production performance. Finally, we discuss interviewees' pain points and anti-patterns, with implications for tool design.

From the interviews we can check what they require for validation.

deebuls commented 1 year ago

Requirements Engineering for Machine Learning: Perspectives from Data Scientists

They interviewed data scientists and drew conclusions about requirements engineering. In the conclusion section they comment on V&V:

D. Verification & Validation

Due to the dependency between the behavior of an ML system and the data it has been trained on, it is crucial to define actions that ensure that training data actually corresponds to real data. Since data characteristics in reality may change over time, requirements validation becomes an activity that needs to be performed continuously during system operation. Our interviewees agreed that monitoring and analysis of runtime data is essential for maintaining the performance of the ML system. They also agreed that ML systems need to be retrained regularly to adjust to recent data. By analyzing the problem domain, a requirements engineer should specify when and how often retraining is necessary. A requirements engineer should also specify conditions for data anomalies that may potentially lead to unreasonable behavior of the ML system during runtime. A checklist of measures to be considered during operations of ML systems is provided by Breck et al. [29]. Apart from runtime monitoring, requirements validation also includes analyzing the training and production data for bias and imbalances.
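
For our purposes, the runtime monitoring they describe could start as simple distribution checks between training and production data. A minimal sketch follows; the KS test and the 0.05 threshold are illustrative choices, not something the paper prescribes.

```python
# Minimal sketch of runtime data monitoring: flag features whose production
# distribution drifts away from the training distribution (candidates for retraining).
import numpy as np
from scipy.stats import ks_2samp

def drifted_features(train: np.ndarray, production: np.ndarray, alpha: float = 0.05):
    """Return (index, statistic, p-value) for features that no longer match training data."""
    flagged = []
    for i in range(train.shape[1]):
        stat, p_value = ks_2samp(train[:, i], production[:, i])
        if p_value < alpha:  # distributions differ -> data anomaly / drift condition
            flagged.append((i, stat, p_value))
    return flagged

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    train = rng.normal(0.0, 1.0, size=(1000, 3))
    production = train.copy()
    production[:, 2] += 0.8  # simulate drift in one feature
    for idx, stat, p in drifted_features(train, production):
        print(f"feature {idx} drifted: KS={stat:.2f}, p={p:.3g}")
```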