MultiX-Amsterdam / ijmond-camera-monitor

https://ijmondcam.multix.io
MIT License

Prepare interview transcription #7

yenchiah closed this issue 3 months ago

yenchiah commented 6 months ago

The transcription solution has to be GDPR compliant.

Research how to transcribe Dutch interviews. For example, below are some potential solutions:

Below is the experience from the other PhD student who did the Dutch transcription before:

kingilsildor commented 6 months ago

AWS transcribe ✔️

AWS customers can use all AWS services to process personal data (as defined in the GDPR) that is uploaded to the AWS services under their AWS accounts (customer data) in compliance with the GDPR. https://aws.amazon.com/compliance/gdpr-center/
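For reference, a minimal sketch of what submitting a Dutch recording to AWS Transcribe could look like with `boto3`. The job name, bucket names, and the `eu-west-1` region are hypothetical placeholders, and picking an EU region is an assumption about where we would want the data to stay, not something the AWS statement above prescribes.

```python
def build_job_request(job_name, media_uri, output_bucket):
    """Assemble the arguments for a Dutch (nl-NL) transcription job."""
    return {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": media_uri},  # audio already uploaded to S3
        "LanguageCode": "nl-NL",
        "OutputBucketName": output_bucket,  # transcript JSON is written here
    }


def start_job(request, region="eu-west-1"):
    """Submit the job; needs `pip install boto3` and configured AWS credentials."""
    import boto3  # imported lazily so the helper above works without boto3

    # Assumption: an EU region is used so the job runs in the EU.
    client = boto3.client("transcribe", region_name=region)
    return client.start_transcription_job(**request)


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    request = build_job_request(
        "interview-01",
        "s3://example-interviews/interview-01.mp3",
        "example-transcripts",
    )
    print(request)
```

Calling `start_job(request)` would actually submit the job; the `__main__` block only prints the request so nothing is uploaded by accident.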

kingilsildor commented 3 months ago

OpenAI Whisper 〰️

If you are running it locally, you aren't sharing any data. If you use their API you can opt out of data sharing. https://github.com/openai/whisper/discussions/1462

How is this GDPR compliant, you might ask? You will process and transcribe locally on your machine, meaning none of the data is sent anywhere. However, it requires us to download the entire model onto our computers. https://www.chiangs.dev/blog/gdpr-compliant-transcription-with-ai

Personal data is collected https://openai.com/policies/privacy-policy/
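For the local option, a rough sketch of what running Whisper on our own machine could look like, assuming the `openai-whisper` package and `ffmpeg` are installed. The file name `interview.mp3`, the `medium` model size, and the `format_segments` helper are illustrative choices, not something taken from the links above.

```python
from pathlib import Path


def format_segments(segments):
    """Turn Whisper-style segments into timestamped transcript lines."""
    lines = []
    for seg in segments:
        minutes, seconds = divmod(int(seg["start"]), 60)
        lines.append(f"[{minutes:02d}:{seconds:02d}] {seg['text'].strip()}")
    return "\n".join(lines)


def transcribe_locally(audio_path, model_name="medium"):
    """Transcribe a Dutch recording on this machine; the audio is not uploaded."""
    import whisper  # lazy import; needs `pip install openai-whisper` and ffmpeg

    model = whisper.load_model(model_name)  # downloads the weights on first use
    result = model.transcribe(str(audio_path), language="nl")
    return format_segments(result["segments"])


if __name__ == "__main__":
    audio = Path("interview.mp3")  # hypothetical file name
    if audio.exists():
        print(transcribe_locally(audio))
```

The only network access here is the one-time model download; the recording itself is read from disk and processed locally.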

kingilsildor commented 3 months ago

Notta ✔️

Compliant with the General Data Protection Regulation (GDPR) to properly handle personal data of EU citizens and protect its privacy. https://www.notta.ai/en/security

Foremost, the software complies with all essential security regulations. This includes compliance with GDPR, CCPA, HIPAA, and SOC 2. https://www.softwaretestinghelp.com/notta-review/

kingilsildor commented 3 months ago

Teams ➖

There isn't much information available about it, but other people have noted some privacy concerns: https://www.charityconnect.co.uk/post/gdpr-implications-of-meeting-transcriptions/13691

Microsoft collects data from you, through our interactions with you and through our products. You provide some of this data directly, and we get some of it by collecting data about your interactions, use, and experiences with our products. The data we collect depends on the context of your interactions with Microsoft and the choices you make, including your privacy settings and the products and features you use. We also obtain data about you from third parties. https://privacy.microsoft.com/en-gb/privacystatement

Always be mindful of what personal data is shared or written in conversations on Teams Chat, like any other platform it may be subject to disclosure and therefore the content should remain professional. https://www.staffnet.manchester.ac.uk/news/display/?id=27905

kingilsildor commented 3 months ago

Amberscript ✔️

Amberscript develops technology that automatically transforms human speech into text. This technology can be accessed via the Amberscript editor (app.www.amberscript.com) and when using this service, your data is only visible to yourself. No other people are involved or have access to your audio, video, or transcripts. To use our services, you upload media files such as audio or video. We use uploaded audio and user feedback to generate anonymized training data for speech-to-text engines. We do this to improve the speech-to-text engine and to improve the user experience by offering a continuously improving service. The audio and transcripts are only processed automatically, encrypted, and anonymized. You can indicate at any time that your audio and transcripts should not be used to improve the speech-to-text engine and your user experience by sending an email to info@amberscript.com. Specifically, Google Workspace APIs (and files downloaded from Google Workspace APIs) are not used to develop, improve, or train generalized AI and/or ML models. https://www.amberscript.com/en/privacy-policy/

1) We are fully compliant with GDPR policies. We implemented all the security measures to protect, store and handle your data. Your data is always stored in Western Europe. https://www.amberscript.com/en/privacy-policy/

kingilsildor commented 3 months ago

AWS transcribe ✔️ Amberscript ✔️ Notta ✔️

OpenAI Whisper 〰️

Teams ➖

@yenchiah