Open codelive opened 3 months ago
Hi @codelive, could you elaborate on this?
Hi @antoine-tran,
Here's my test code:
from audioseal import AudioSeal
import torch
import torchaudio


def watermark_embed():
    # Load the 16-bit generator and embed a fixed message into input.wav.
    model = AudioSeal.load_generator("audioseal_wm_16bits")
    audio, sample_rate = torchaudio.load("input.wav")
    audios = audio.unsqueeze(0)  # add a batch dimension: (batch, channels, samples)
    bits = [[1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0]]
    secret_message = torch.tensor(bits, dtype=torch.int32)
    print(f"bits: {secret_message}")
    watermarked = model(audios, sample_rate=sample_rate, message=secret_message, alpha=1)
    watermarked_audio = watermarked.detach()
    torchaudio.save("output_seal.wav", src=watermarked_audio[0], sample_rate=sample_rate)


def watermark_detect():
    # Load the watermarked file and decode the score and the 16-bit message.
    audio, sample_rate = torchaudio.load("output_seal.wav")
    audios = audio.unsqueeze(0)
    detector = AudioSeal.load_detector("audioseal_detector_16bits")
    result, message = detector.detect_watermark(audios, sample_rate=sample_rate, message_threshold=0.5)
    print(f"bits: {message}, score: {result}")


# watermark_embed()
watermark_detect()
First, I call watermark_embed() to save the watermarked audio as "output_seal.wav":
bits: tensor([[1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0]], dtype=torch.int32)
Second, I call watermark_detect() on that file to detect the watermark:
bits: tensor([[1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0]], dtype=torch.int32), score: 1.0
Next, I use ffmpeg to transcode the watermarked file to AAC at 64 kbps, then convert the AAC back to WAV:
ffmpeg -y -i output_seal.wav -b:a 64k output_seal.aac
ffmpeg -y -i output_seal.aac output_seal.wav
bits: tensor([[0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0]], dtype=torch.int32), score: 0.6068740487098694
The detection score drops to about 0.6, and the decoded bits no longer match the original message.
Then I record the system's sound output with OBS and run detection on the recording:
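For anyone who wants to repeat the AAC round trip from a script, here is a minimal sketch using only the Python standard library. The function names are hypothetical, and the ffmpeg flags are exactly the ones from the two commands above. One deliberate difference: the decoded audio goes to a separate file (`output_seal_aac.wav`, an assumed name) so the original watermarked WAV is not overwritten.

```python
import shutil
import subprocess


def aac_roundtrip_commands(src_wav, aac_path, dst_wav, bitrate="64k"):
    """Build the two ffmpeg commands: WAV -> AAC at `bitrate`, then AAC -> WAV."""
    encode = ["ffmpeg", "-y", "-i", src_wav, "-b:a", bitrate, aac_path]
    decode = ["ffmpeg", "-y", "-i", aac_path, dst_wav]
    return [encode, decode]


def run_aac_roundtrip(src_wav, aac_path, dst_wav, bitrate="64k"):
    """Run both commands in order; raises if ffmpeg is missing or a step fails."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    for cmd in aac_roundtrip_commands(src_wav, aac_path, dst_wav, bitrate):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Only prints the commands; call run_aac_roundtrip(...) to execute them.
    for cmd in aac_roundtrip_commands(
        "output_seal.wav", "output_seal.aac", "output_seal_aac.wav"
    ):
        print(" ".join(cmd))
```

Pointing watermark_detect() at the decoded file then reproduces the detection step on the transcoded audio.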
bits: tensor([[1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1]], dtype=torch.int32), score: 0.202103391289711
Screenshots of the OBS recording and system audio settings:
I turned off auto enhancement and then recorded again with OBS. The detection score is about 0.9, but the decoded bits still don't match my original message:
bits: tensor([[1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1]], dtype=torch.int32), score: 0.9000480771064758
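To put a number on "doesn't match", here is a small sketch that compares each decoded message with the embedded one, bit by bit. It is plain Python and does not use the AudioSeal API. `bit_accuracy` is a hypothetical helper and the dictionary keys are labels I made up for the three runs; the bit lists are copied from the outputs above.

```python
def bit_accuracy(sent, received):
    """Return the fraction of positions where the two bit lists agree."""
    if len(sent) != len(received):
        raise ValueError("messages must have the same length")
    matches = sum(1 for s, r in zip(sent, received) if s == r)
    return matches / len(sent)


# Message embedded by watermark_embed().
original = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0]

# Messages decoded by watermark_detect() in the three experiments.
decoded = {
    "aac_64k": [0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0],
    "obs_default": [1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1],
    "obs_no_enhancement": [1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1],
}

for name, bits in decoded.items():
    print(f"{name}: {bit_accuracy(original, bits):.4f}")
# aac_64k: 0.7500
# obs_default: 0.5625
# obs_no_enhancement: 0.6250
```

With 16 bits, that is 4, 7, and 6 wrong bits respectively. In the last run the detection score is about 0.9 while 6 of the 16 bits are still wrong, so in these runs a high detection score did not imply a correctly decoded message.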
The wav file I used: input.zip
Thank you very much for your reply.
Thanks for raising this! I don't see a way to solve it without fine-tuning the model with these augmentations. Let us know if you find anything.