kouloumos closed this issue 8 months ago
This is thorough and will help reviewers
I propose adding an example to "For code-related technical terms or math equations, enclose them in backticks (`) for clarity."
This should be in the tutorial (if it is, it should be more evident as I read it all). I just submitted a transcription where I removed the timestamps because I used an old transcript as a reference. See #375.
Also, there's peer review? I mean, as a contributor, may I review stuff or just submit edited transcripts? Or review is reserved for maintainers?
> This should be in the tutorial (if it is, it should be more evident as I read it all). I just submitted a transcription where I removed the timestamps because I used an old transcript as a reference. See #375.
We are planning on adding this to the tutorial.
> Also, there's peer review? I mean, as a contributor, may I review stuff or just submit edited transcripts? Or review is reserved for maintainers?
For now, we are reviewing each submission. But we are currently working on making reviewing easier, so in the future reviewing could potentially be a peer review process.
The review guidelines look good. I recommend linking to 3 or so examples of transcripts that fully meet your criteria, to provide visual references.
I also suggest using more invitational language, like "to ensure we maintain a shared quality bar for transcripts, please make sure you meet the following criteria: xyz."
Where do you plan to surface these guidelines for users in their flow? Imagine someone logs in for the first time -
The current flow is designed to move more people through the funnel with less churn. So a user could theoretically skip the instructions and then be surprised by this mandatory pop-up modal when they click submit:
Great work otherwise!
Review guidelines are now live, so hopefully, I will not need to point to this issue ever again.
Setting the tone for reviewers is crucial: it emphasizes their pivotal role in improving the original AI transcription. I'm currently exploring several ideas aimed at refining our onboarding flow to set appropriate expectations.
One such idea is to provide users with clear Review Guidelines. Since going public, I've written different flavors of review guidelines as feedback on users' submissions. The following is my attempt to generalize that feedback into clear guidelines for first-time reviewers.
Review Guidelines
Transcription Style:
Transcript Structure:
Maintain original "one-sentence-per-line" formatting and timestamps.

**Example of formatting with multiple speakers**

```markdown
## Chapter Header

Aaron van Wirdum: 00:12:14

I think these are the two most common amounts of words.
Why is there a difference?
Is one just more secure?

Sjors Provoost: 00:12:18

Yeah, it's more bits.
12 words means 128 bits of random data.
You're basically throwing 128 coins, heads or tails.
```

Ensure coherent paragraphing around chapter titles and speaker timestamps.
Break text into paragraphs and ensure accurate punctuation for better readability.
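To illustrate the paragraphing and punctuation guideline, here is a short hypothetical before/after (the quoted speech is invented for illustration, not taken from a real transcript):

```markdown
<!-- Before: raw AI output, unpunctuated run-on -->
so the way it works is we look at recent blocks and we bucket
transactions by feerate and then we track how long each bucket takes
to confirm and thats how the estimate gets built

<!-- After: punctuated, one sentence per line, paragraph break at the topic shift -->
So the way it works is we look at recent blocks, and we bucket transactions by feerate.

Then we track how long each bucket takes to confirm, and that's how the estimate gets built.
```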
Chapters:
Accuracy:
Identify and fix AI transcription errors, especially related to technical terms and Bitcoin-specific jargon.
Ensure code-related technical terms or math equations are enclosed in backticks (`) to enhance clarity.

**Example with code-related technical terms**

_"It is a scalar. A private key kept secret and used to sign. It is a scalar in the `secp256k1` group. In Bitcoin Core that is a class called `CKey`. That’s in `src/key.h`."_

**Example with math equation**

_"But if you look at that you can see that actually that's `(e*x)*G = e*(x*G)`, and `x*G` is the public key. So that's actually `e*(x*G) = e*P`"_

Speaker Attribution:
Final Review:
Quality assessment metrics for transcript submissions
Having clear guidelines also helps with quantitative quality assessment. I've come up with two sets of evaluation criteria to help with submission evaluation and quality assessment.