anngvu / accent

AI-Assisted Curation/Content ENhancement Tool for Synapse (pre-alpha)
MIT License

ACCENT

[!WARNING]
This is a prototype application. Development is still working on mitigating the risks of using generative AI. Our current target users are data professionals.

Motivation

For biomedical curators/data managers

Research communities supported by dedicated data curators/managers benefit from having data packaged and disseminated optimally for reuse. Data managers themselves could benefit from tooling that facilitates their important and hard work of curating data, developing the data model, and facilitating data sharing in general. As with other knowledge work, incorporating AI could greatly boost productivity, though this is perhaps best achieved through an internal or "wrapper" interface that mitigates pitfalls^1.

Developers can also help by figuring out where AI can be inserted into workflows and how to design the technology for doing so.

This is such an application. Some data management responsibilities^2 prioritized for an assisted workflow are:

  1. Data curation -- create, organize, QC, and publish FAIR/harmonized data assets to the best advantage.
  2. Develop standards and data models.
  3. Maintain data management plans and SOPs.
  4. Facilitate data analysis/reuse and reporting for stakeholders, regulatory authorities, etc.
  5. Oversee the integration of apps/new technologies and initiatives into data standards and structures.

Usage

With more power comes more responsibility. Unlike interacting with generative AI in the default web interface, the application infrastructure here includes prompts and logic already optimized for project-specific workflows, direct API access to relevant platforms (Synapse), the local file system, configured databases, and additional tools/agents to accomplish various tasks. This infrastructure will also need to include guardrails.

Until this is released as a .jar, you will need some Clojure tooling to run it.

Configuration

Settings and (optionally) credentials can be defined in config.edn. Review the example_config.edn file; rename it to config.edn and modify as needed. In addition to the comments in the example file, more discussion is provided below.
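As a rough illustration of the kind of file this describes, a config.edn might be shaped like the sketch below. Every key and value here is hypothetical and only shows the EDN syntax (a map of keywords to values, with `;` comments); the actual keys are the ones documented in example_config.edn.

```clojure
;; config.edn -- illustrative sketch only.
;; All keys below are hypothetical placeholders, NOT the app's real settings;
;; copy the real keys from example_config.edn.
{:model-provider "OpenAI"                ; hypothetical: provider to start with
 :ui             :web                    ; hypothetical: :web or :terminal
 :credentials    {:openai-api-key    "<your-openai-key>"
                  :anthropic-api-key nil ; nil = not configured
                  :synapse-token     "<your-synapse-token>"}}
```

Since credentials are optional in the config, leaving them out of the file keeps secrets out of anything that might be committed or shared.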

AI Providers

The app integrates two providers, Anthropic and OpenAI, and an initial model provider must be specified. Within the same chat, it is possible to switch between models from the same provider but not between providers: for example, you can switch from GPT-3.5 to GPT-4o, but not from GPT-3.5 to Claude 3.5 Sonnet. However, the existence of the switching feature does not mean that the user should be manually and frequently switching between models. For both providers, the default is a model on the smarter end, though later on it may be possible to specify an initial model in the config. Usage tip: trying to reduce costs by switching to a cheaper model for some tasks is likely premature optimization at this early stage.
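The same-provider rule can be sketched in a few lines of Clojure. This is a minimal illustration under stated assumptions, not the app's actual implementation: `provider-models`, `provider-of`, and `switch-model` are hypothetical names, the chat is modeled as a plain map, and the model ids are only examples.

```clojure
;; Hypothetical provider -> model-id table (illustrative ids, not the app's list).
(def provider-models
  {:openai    #{"gpt-3.5-turbo" "gpt-4o"}
   :anthropic #{"claude-3-5-sonnet-20240620"}})

(defn provider-of
  "Returns the provider keyword that offers `model`, or nil if the model is unknown."
  [model]
  (some (fn [[provider models]]
          (when (contains? models model) provider))
        provider-models))

(defn switch-model
  "Returns `chat` with :model replaced when `new-model` belongs to the provider
  the chat was started with; throws ex-info otherwise (including unknown models)."
  [{:keys [provider] :as chat} new-model]
  (if (= provider (provider-of new-model))
    (assoc chat :model new-model)
    (throw (ex-info "Model is not available from this chat's provider"
                    {:chat-provider   provider
                     :requested-model new-model}))))
```

With a chat such as `{:provider :openai :model "gpt-3.5-turbo"}`, calling `switch-model` with `"gpt-4o"` returns the updated map, while calling it with the Anthropic model id throws. Failing loudly rather than silently changing provider is one possible design choice; the app may handle this case differently.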

Run your preferred UI and specialized curation agent

[!NOTE]
Currently, there are some tradeoffs between the terminal and the web UI. The web UI has some features that the terminal does not, such as showing figures. On the other hand, the web UI only works with OpenAI for now.

For Synapse curation

The web UI is highly recommended:

For the terminal:

Demos / tutorials for various scenarios

Dynamic Roadmap

This roadmap adapts to the feedback and interest received. Functionality has been scoped/mapped as below for specific versions. Feel free to propose a new feature or to request fast-tracking of an existing one.

Nothing more is planned until after the Evaluation (below).

Evaluation

Feedback is currently being gathered from curators who are being trained to integrate this into their workflows. The comparisons will be between workflows that:

  1. Do not incorporate any LLM and do things manually, maybe with custom scripting or with some other non-AI app.
  2. Incorporate generative AI but only via the default online chat interface.
  3. Incorporate generative AI through a different custom interface/solution.