LAION-AI / Open-Assistant

OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.
https://open-assistant.io
Apache License 2.0

Using Linguistic Steganography in Open Assistant #792

Closed SummerSigh closed 11 months ago

SummerSigh commented 1 year ago

This issue pertains to ideas around watermarking Open Assistant generations via Linguistic Steganography. Linguistic Steganography is an active research field, but a few methods have emerged that make me confident it can be used effectively to identify Open Assistant generations. This issue isn't going to detail those methods but rather discuss the ethical debate around using Linguistic Steganography.

In my opinion, here are the main benefits:

  1. Preventing plagiarism: Watermarking can help prevent others from using the generated text without proper attribution, which is especially important in academic and professional settings.
  2. Identifying false information: Watermarking can also be used to identify fake or misleading information that is generated by Open Assistant, which can help combat misinformation and disinformation.

However, I also have the following concerns:

  1. Limited applicability: Watermarking may not be effective in certain contexts, such as when the text is heavily edited or translated.
  2. Decreasing efficiency: Watermarking can add an extra step to the text generation process, which can slow down the process and make it less efficient.
  3. Complicating collaboration: Watermarking can make it more difficult for multiple users to work together on a project. For example, if part of a program is written using Open Assistant, it becomes more difficult to determine which parts of the code were generated by the model.

I think this is a subject that merits real discussion, and it requires multiple viewpoints so we can properly evaluate whether this is something we should pursue further.

huu4ontocord commented 1 year ago

Thank you for bringing this important issue up, @SummerSigh.

This is a hard issue, and I admit I haven't thought about it very much. My main focus is on having people use our tool to do better work and to help people learn.

From the education perspective, I think cheating on school essays could be a problem. However, I don't know if we should build watermarking in upfront, or instead put in API hooks that let, say, a school's open-source community plug such a tool into the generator, so those communities can decide for themselves.

As for false information, I know this is a real problem. But the danger is that whatever solution we devise to limit fake news via tracking (with a switch turned on by default, for example, that people can turn off if they are researching this kind of thing) risks propagating tracking technologies. I don't want orgs or governments to be able to track people unless they follow the process set up in their jurisdictions.

Another way to limit fake news is to make our models more factual, which we are trying to do. But we also don't want to stop people from creating fiction. An alternative history where Hitler won the war and everyone in America lives under Nazi rule (see, e.g., The Man in the High Castle) would have news articles very different in tone from our current ones. I can imagine people could, and should be able to, use OA to generate such fiction. Could that be used for fake news?

I look forward to the discussion here. Also, we should propose solutions that are actually implementable with tech we have now or can reasonably create. For example, I think watermarking could be done via some sort of decoder skewing that might not change the semantics much, but we would need to research this.
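To make that concrete, here is a minimal sketch of one decoder-skewing scheme, along the lines of the "green list" watermark of Kirchenbauer et al. (2023). The function name and the gamma/delta defaults are illustrative, not an existing OA API:

```python
import hashlib

import torch


def greenlist_bias(prev_token_id: int, vocab_size: int,
                   gamma: float = 0.5, delta: float = 2.0) -> torch.Tensor:
    """Logit bias favoring a pseudo-random 'green list' seeded by the previous token.

    gamma: fraction of the vocabulary placed on the green list.
    delta: logit boost for green tokens (small enough to barely change
           semantics, large enough to be statistically detectable).
    """
    seed = int(hashlib.sha256(str(prev_token_id).encode()).hexdigest(), 16) % (2**31)
    rng = torch.Generator().manual_seed(seed)
    green = torch.randperm(vocab_size, generator=rng)[: int(gamma * vocab_size)]
    bias = torch.zeros(vocab_size)
    bias[green] = delta
    return bias


# At each decoding step, skew the raw logits before sampling:
# logits = logits + greenlist_bias(last_token_id, logits.shape[-1])
```

A detector that knows the seeding scheme can recount the green tokens in a suspect text and run a simple statistical test, without needing access to the model.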

SummerSigh commented 1 year ago

@ontocord here are some implementations I have tested and know work:

https://github.com/falcondai/lm-steganography
https://github.com/ku-nlp/steganography-with-masked-lm
https://github.com/mickeysjm/StegaText
https://github.com/jumon/himitsu

I personally like steganography-with-masked-lm so I put it up in this Kaggle notebook:

https://www.kaggle.com/code/summerbreeze11/text-steganograpthy/edit
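The core idea, roughly: the secret bits select which of a masked LM's top candidates fills a masked slot, and the decoder recovers the bits by re-ranking those candidates. Here is a minimal sketch of that idea (bert-base-uncased and the helper below are illustrative, not the linked repo's actual API):

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")


def encode_bits(text_with_mask: str, bits: str) -> str:
    """Hide `bits` by choosing among the masked LM's top 2**len(bits) candidates."""
    inputs = tokenizer(text_with_mask, return_tensors="pt")
    mask_pos = (inputs.input_ids[0] == tokenizer.mask_token_id).nonzero().item()
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos]
    candidates = logits.topk(2 ** len(bits)).indices  # 2^k fluent fillers
    chosen = candidates[int(bits, 2)]  # the secret bits index the filler
    word = tokenizer.decode([chosen.item()])
    return text_with_mask.replace(tokenizer.mask_token, word)


# The decoder reruns the same model, finds the chosen word's rank among the
# top candidates, and reads that rank back as the hidden bits.
print(encode_bits("The weather today is [MASK].", "10"))  # hides two bits
```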

huu4ontocord commented 1 year ago

Thank you for sharing these links!

huu4ontocord commented 1 year ago

Following up on this: is anyone interested in exploring it more? Maybe we create an API hook into the safety pipeline to steer decoding, and any org can put its own method/callback in there. This could be useful for privately run bots.
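As a rough sketch of what such a hook could look like (the class name is hypothetical; only the transformers LogitsProcessor interface it builds on is real):

```python
import torch
from transformers import LogitsProcessor, LogitsProcessorList


class PluggableDecodingHook(LogitsProcessor):
    """Hypothetical safety-pipeline hook: each org supplies its own bias_fn."""

    def __init__(self, bias_fn):
        # bias_fn(input_ids, scores) -> tensor of per-token logit adjustments
        self.bias_fn = bias_fn

    def __call__(self, input_ids: torch.LongTensor,
                 scores: torch.FloatTensor) -> torch.FloatTensor:
        # Apply the org-supplied skew to the raw logits at each decoding step.
        return scores + self.bias_fn(input_ids, scores)


# An org registers its watermarking/steganography method at generation time:
# model.generate(**inputs,
#                logits_processor=LogitsProcessorList([PluggableDecodingHook(my_bias)]))
```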

SummerSigh commented 1 year ago

I’ll take a look at it further. I’ll put my updates here as I go.