xp1632 / DFKI_working_log

Paper summary #1

Open xp1632 opened 1 year ago

xp1632 commented 1 year ago

https://arxiv.org/pdf/2202.08946.pdf

xp1632 commented 1 year ago

-Feature 1: Modular Components

-JS-based components, flexible
-Each component is passed three parameters:
    -meta-data table
    -derived state variables like grouped tables
    -reference to raw data instance like images
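
A minimal sketch of the three inputs listed above. It is written in Python for readability even though the real components are JS-based, and every name in it (`ComponentInputs`, `group_by`, `summary_component`, the field names) is hypothetical, not Symphony's API.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


@dataclass
class ComponentInputs:
    """Hypothetical bundle of the three inputs a component receives."""
    table: List[Row]                      # metadata table, one row per instance
    grouped: Dict[str, List[Row]]         # derived state, e.g. rows grouped by a column
    load_instance: Callable[[Row], str]   # reference to the raw data (image, audio, ...)


def group_by(table: List[Row], column: str) -> Dict[str, List[Row]]:
    """Derive a grouped table from the metadata table."""
    groups: Dict[str, List[Row]] = {}
    for row in table:
        groups.setdefault(str(row[column]), []).append(row)
    return groups


def summary_component(inputs: ComponentInputs) -> List[str]:
    """Toy component: one text line per group with its size and a sample instance."""
    lines = []
    for name, rows in sorted(inputs.grouped.items()):
        sample = inputs.load_instance(rows[0])
        lines.append(f"{name}: {len(rows)} instances, e.g. {sample}")
    return lines


table = [
    {"id": 0, "label": "cat", "path": "img/0.png"},
    {"id": 1, "label": "dog", "path": "img/1.png"},
    {"id": 2, "label": "cat", "path": "img/2.png"},
]
inputs = ComponentInputs(
    table=table,
    grouped=group_by(table, "label"),
    load_instance=lambda row: row["path"],
)
summary = summary_component(inputs)
```

The point of the sketch is that a component only reads these three inputs, so the same component can be reused wherever a wrapper can supply them.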

-Feature 2: Environment Wrapper

-Wrappers act as the backing platform that connects components
-Programming environment wrappers for exploratory analysis
-Web-based UI for sharing insights

-Python Wrapper
-The ipywidgets API renders web-based widgets in the Jupyter notebook UI
and synchronizes their variables with the Python kernel
-Supports pandas DataFrames

-Web-based dashboard wrappers
-The programming environment is exported as a statically hosted website
-Components can be configured before export to fit a particular use case

-Feature 3: Interaction Tools

-Every component has the same interaction tools
-Changes are synchronized between the Jupyter notebook and the web-based UI

-In the web-based UI, state changes are also saved in the URL
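
One way saving state in the URL could work, sketched with the Python standard library. The `state` query parameter, the JSON encoding, and the example URL are assumptions made for illustration; the notes do not say how Symphony actually encodes its state.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def state_to_url(base_url: str, state: dict) -> str:
    """Store the tool state as a JSON string in a `state` query parameter."""
    scheme, netloc, path, _query, fragment = urlsplit(base_url)
    query = urlencode({"state": json.dumps(state, sort_keys=True)})
    return urlunsplit((scheme, netloc, path, query, fragment))


def url_to_state(url: str) -> dict:
    """Recover the tool state from a URL; return an empty state if none is stored."""
    values = parse_qs(urlsplit(url).query).get("state")
    return json.loads(values[0]) if values else {}


# Hypothetical filter/group settings chosen in the dashboard.
state = {"filter": "label == 'cat'", "group_by": "split"}
shared_url = state_to_url("https://example.com/dashboard", state)
restored = url_to_state(shared_url)
```

Because the whole state travels in the link, sending `shared_url` to a colleague reproduces the same filtered and grouped view without any server-side storage.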

-Three ways of using tools:
    -UI toolbar (in both Jupyter and web), shown on the right of Fig. 4 and on the left of Fig. 2
    as a sidebar (can filter and group data)
    -The components themselves
    -Python code
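
The three entry points above can all write to one shared state that every component listens to. The sketch below shows that idea with a plain observer pattern; `SharedToolState` and its methods are hypothetical names, and the real synchronization goes through the wrapper (for example ipywidgets in Jupyter), which is not modeled here.

```python
from typing import Any, Callable, Dict, List

Listener = Callable[[Dict[str, Any]], None]


class SharedToolState:
    """Hypothetical store for filter/group settings shared by all components."""

    def __init__(self) -> None:
        self._state: Dict[str, Any] = {"filter": None, "group_by": None}
        self._listeners: List[Listener] = []

    def subscribe(self, listener: Listener) -> None:
        self._listeners.append(listener)

    def update(self, **changes: Any) -> None:
        """Apply a change from any source and notify every subscriber."""
        self._state.update(changes)
        snapshot = dict(self._state)
        for listener in self._listeners:
            listener(snapshot)


seen_by_list: List[Dict[str, Any]] = []
seen_by_scatter: List[Dict[str, Any]] = []

tools = SharedToolState()
tools.subscribe(seen_by_list.append)      # stands in for a list component
tools.subscribe(seen_by_scatter.append)   # stands in for a scatterplot component

tools.update(filter="label == 'cat'")     # e.g. set from the toolbar
tools.update(group_by="split")            # e.g. set from Python code
```

Whichever entry point triggers `update`, both stand-in components receive the same snapshot, which is the behavior the notes describe as synchronized tools.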

-Added support for unstructured data like audio
-If the developer wants to visualize other data types, they only need to implement a rendering function for the new type, which all components can then use
-For large datasets, custom pagination is added
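
A rough sketch of both ideas: a registry with one rendering function per data type that every component can call, and simple pagination so only one page of a large dataset is rendered at a time. `RENDERERS`, `register_renderer`, `render`, and `paginate` are hypothetical helpers, and the HTML strings are only placeholders for real rendering.

```python
from typing import Callable, Dict, List, Sequence, TypeVar

T = TypeVar("T")
Renderer = Callable[[str], str]

# Hypothetical registry: one rendering function per data type, shared by all components.
RENDERERS: Dict[str, Renderer] = {
    "image": lambda src: f'<img src="{src}">',
}


def register_renderer(data_type: str, renderer: Renderer) -> None:
    """Add support for a new data type without touching any component."""
    RENDERERS[data_type] = renderer


def render(data_type: str, src: str) -> str:
    """Look up the renderer for a data type and apply it to one instance."""
    return RENDERERS[data_type](src)


def paginate(items: Sequence[T], page: int, page_size: int) -> List[T]:
    """Return the items on a zero-indexed page."""
    start = page * page_size
    return list(items[start:start + page_size])


# Supporting audio only requires registering one new function.
register_renderer("audio", lambda src: f'<audio controls src="{src}"></audio>')

clips = [f"clips/{i}.wav" for i in range(25)]
second_page = [render("audio", src) for src in paginate(clips, page=1, page_size=10)]
```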

-All components are built from a component 'cookiecutter' template
-Three methods of creating components:
    1. JS with the visualization packages Vega and D3
    2. The JS library REGL Scatterplot (a WebGL library) for the projection component
    3. Directly using Svelte components

Chapter 7: Case Study

-Stage 1: think aloud while exploring data and models using Symphony

-Stage 2: export the coding environment to a web-based dashboard

-Stage 3: ask for feedback

Case study 1: Validating data patterns

Pros:
-Sharable report for the dataset
-Similarity component, filtering toolbar
-Synchronized charts

Limits:
-Complicated components need an introduction
-More 2D charts such as heatmaps are wanted

Case study 2: Debugging training data

Pros:
-Similarity works

Limits:
-Model analysis should also be added

Case study 3:

Pros:
-Brings data to the forefront

Limits:
-Linking and unlinking different components and seeing the resulting state change

xp1632 commented 1 year ago

Summary of Chapters 1-4:

-Shortcoming of existing ML interfaces: they are not designed to be reused, explored, and shared in cross-functional teams

-Our framework can be used across platforms, from computational notebooks to web dashboards

-They cite interactive programming widgets (e.g., ipywidgets [32], Streamlit [31]) that give practitioners insights into what their datasets contain and how their models behave; they also note that these widgets lack support for unstructured data such as images and audio

-Our framework, Symphony, has a code environment such as Jupyter and a no-code environment in the form of web-based UIs

-Symphony can import/export components to a self-contained, web-based UI
-Data functions:
    -modular visualizations for datasets
    -for unstructured data (e.g. images), a repackaging of many other visual tools
    -links information by providing both a coding and a non-coding environment
    -supports any JS-based visualization
-Symphony components can also be used in other frameworks

-Use cases for ML interfaces:
    -ML interfaces need to be flexible to support the needs of different stakeholders
    -the lack of adequate tools leads practitioners to build one-off, manual tools of their own
    -other tools require a specific data format just for visualization
        -they don't want a tool that takes more than 5-10 minutes just to get visuals
    -visualization support is needed for more data types (image/video) and models
    -lack of communication between different stakeholders makes it hard to share insights