The following 2 cases can't be detected

codelive commented 3 months ago

1. Audio re-encoded to AAC at 64 kbps.
2. System audio output recorded with OBS. Audio enhancement is enabled by default on Windows 11.

antoine-tran commented 3 months ago

Hi @codelive, could you elaborate on this?

codelive commented 3 months ago

Hi @antoine-tran,

Here's my test code:

```python
import torch
import torchaudio
from audioseal import AudioSeal


def watermark_embed():
    # Load the 16-bit watermark generator.
    model = AudioSeal.load_generator("audioseal_wm_16bits")
    audio, sample_rate = torchaudio.load("input.wav")
    audios = audio.unsqueeze(0)  # add a batch dimension
    bits = [[1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0]]
    secret_message = torch.tensor(bits, dtype=torch.int32)
    print(f"bits: {secret_message}")
    # Embed the 16-bit message into the audio.
    watermarked = model(audios, sample_rate=sample_rate, message=secret_message, alpha=1)
    watermarked_audio = watermarked.detach()
    torchaudio.save("output_seal.wav", src=watermarked_audio[0], sample_rate=sample_rate)


def watermark_detect():
    audio, sample_rate = torchaudio.load("output_seal.wav")
    audios = audio.unsqueeze(0)  # add a batch dimension
    detector = AudioSeal.load_detector("audioseal_detector_16bits")
    # Returns the detection score and the decoded message bits.
    result, message = detector.detect_watermark(audios, sample_rate=sample_rate, message_threshold=0.5)
    print(f"bits: {message}, score: {result}")


# watermark_embed()
watermark_detect()
```

1. Call watermark_embed() to save a watermarked audio file, "output_seal.wav":
   bits: tensor([[1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0]], dtype=torch.int32)
2. Call watermark_detect() on that file. The watermark is detected and the message matches:
   bits: tensor([[1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0]], dtype=torch.int32), score: 1.0
3. Use ffmpeg to transcode the watermarked file to AAC at 64 kbps, convert the AAC back to WAV, and run detection again:
   ```
   ffmpeg -y -i output_seal.wav -b:a 64k output_seal.aac
   ffmpeg -y -i output_seal.aac output_seal.wav
   ```
   bits: tensor([[0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0]], dtype=torch.int32), score: 0.6068740487098694
   The detection score drops to about 0.6, and the decoded bits do not match the original message.
4. Record the system's sound output with OBS, then run detection on the recording:
   bits: tensor([[1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1]], dtype=torch.int32), score: 0.202103391289711
   Screenshots of the OBS recording and system audio settings:
5. Turn off audio enhancement and record again with OBS. The detection score is about 0.9, but the decoded bits still do not match the original message:
   bits: tensor([[1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1]], dtype=torch.int32), score: 0.9000480771064758
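To make "does not match" concrete, the decoded bits reported above can be compared against the embedded message position by position. The following is a minimal sketch in plain Python. The helper name `bit_errors` and the dictionary labels are illustrative and not part of AudioSeal; the bit lists are copied from the outputs in this comment.

```python
def bit_errors(expected, decoded):
    """Count the positions where two equal-length bit lists differ."""
    if len(expected) != len(decoded):
        raise ValueError("bit lists must have the same length")
    return sum(e != d for e, d in zip(expected, decoded))


# Message embedded by watermark_embed().
ORIGINAL = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0]

# Decoded messages reported for each degraded copy (labels are illustrative).
DECODED = {
    "AAC 64 kbps round trip": [0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0],
    "OBS, enhancement on": [1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1],
    "OBS, enhancement off": [1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1],
}

for label, bits in DECODED.items():
    errors = bit_errors(ORIGINAL, bits)
    print(f"{label}: {errors}/{len(ORIGINAL)} bits differ")
```

With these values, 4, 7, and 6 of the 16 bits differ respectively, so even the recording that scores about 0.9 decodes the message incorrectly.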

The wav file I used: input.zip

Thank you very much for your reply.

pierrefdz commented 1 month ago

Thanks for raising this! I don't see any way of solving this without fine-tuning the model with these augmentations... Let us know if you find anything.
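As a rough illustration of what such an augmentation could look like as offline data preparation, the sketch below wraps the same ffmpeg round trip reported earlier in the thread. It is not AudioSeal's training pipeline. The function names and the `bitrate` parameter are hypothetical, and it assumes `ffmpeg` is available on the PATH.

```python
import subprocess


def aac_round_trip_commands(src_wav, tmp_aac, dst_wav, bitrate="64k"):
    """Build the two ffmpeg invocations for a WAV -> AAC -> WAV round trip."""
    return [
        # Encode the WAV file to AAC at the requested audio bitrate.
        ["ffmpeg", "-y", "-i", src_wav, "-b:a", bitrate, tmp_aac],
        # Decode the AAC file back to WAV.
        ["ffmpeg", "-y", "-i", tmp_aac, dst_wav],
    ]


def aac_round_trip(src_wav, tmp_aac, dst_wav, bitrate="64k"):
    """Run the round trip; raises CalledProcessError if ffmpeg fails."""
    for cmd in aac_round_trip_commands(src_wav, tmp_aac, dst_wav, bitrate):
        subprocess.run(cmd, check=True)
```

Calling `aac_round_trip("output_seal.wav", "output_seal.aac", "output_seal_aac.wav")` would produce a degraded copy without overwriting the original watermarked file, so both versions remain available for comparison.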
