icerpc / icerpc-csharp

A C# RPC framework built for QUIC, with bidirectional streaming, first-class async/await, and Protobuf support.
https://docs.icerpc.dev
Apache License 2.0
102 stars 13 forks source link

Apostrophes in documentation comments are improperly escaped #3984

Closed ReeceHumphreys closed 4 months ago

ReeceHumphreys commented 4 months ago

Description

Documentation comments for slice that contain apostrophes result in generated C# documentation comments with ' instead of an apostrophe.

Reproduction steps

Example Slice:

foo.slice

module Foo

/// Bar's doc
interface Bar {}

Generated code snippet:

foo.IceRpc.cs

...

/// <summary>Bar&apos;s doc</summary>
/// <remarks>The Slice compiler generated this client-side interface from Slice interface <c>Foo::Bar</c>.
/// It's implemented by <see cref="BarProxy" />.</remarks>
public partial interface IBar
{
}

...

Expected behavior

I would expect the generated output comment to look like:

...
/// <summary>Bar's doc</summary>
...

Actual behavior

The apostrophe is being escaped.

...
/// <summary>Bar&apos;s doc</summary>
...

Configuration

slicec-cs version: '0.4.0' (main branch) macOS 14.5 (23F79)

Additional information

No response

ReeceHumphreys commented 4 months ago

Actually, thinking about this a bit more, C# uses XML doc comments so the apostrophes should have escaping so this may not be a bug. Can someone confirm @pepone, @externl, @bernardnormier.

pepone commented 4 months ago

Yes, escaping is required in C# doc comments.

pepone commented 4 months ago

Actually ' doesn't need to be escaped in the summary

See for example:

https://docs.icerpc.dev/api/csharp/api/IceRpc.ClientConnection.html#IceRpc_ClientConnection__ctor_IceRpc_ServerAddress_System_Net_Security_SslClientAuthenticationOptions_IceRpc_Transports_IDuplexClientTransport_IceRpc_Transports_IMultiplexedClientTransport_Microsoft_Extensions_Logging_ILogger_

And source

https://github.com/icerpc/icerpc-csharp/blob/57995b702436169798fc3fe5fd513e6f09591397/src/IceRpc/ClientConnection.cs#L94

InsertCreativityHere commented 4 months ago

Actually ' doesn't need to be escaped in the summary

Does it need to be escaped anywhere?

https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/language-specification/documentation-comments This page doesn't call them out as needing to be escaped, and it shows apostrophes appearing unescaped in <param> tags, <value> tags, <returns>, and <example> tags as well.

And to my knowledge, you don't need to escape single quotes in XML text content.

pepone commented 4 months ago

And to my knowledge, you don't need to escape single quotes in XML text content.

See: https://www.w3.org/TR/xml/#syntax

It seems that &apos; is only required for attributes delimited with single quotes '. For text elements, only the characters & and < need to be escaped.

InsertCreativityHere commented 4 months ago

So I guess the next thing to check is do we ever use ' to delimit attributes. I expect we probably don't, " feels more standard to me, and should accomplish the same thing.

InsertCreativityHere commented 4 months ago

I don't see anywhere in slicec-cs where we're outputting ' for attributes. So it's probably safe to just never escape it.

Should we make our escaping function smarter? To support some way of passing in a context so we don't overescape? Right now it's just:

fn xml_escape(text: &str) -> String {
    text.replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
        .replace('\'', "&apos;")
        .replace('"', "&quot;")
}
bernardnormier commented 4 months ago

I think you just need to remove .replace('\'', "&apos;").

InsertCreativityHere commented 4 months ago

I agree that's definitely the easy fix.

I Just wanted to see if anyone wanted to advocate for adding a more complex fix, but that would only escape when absolutely necessary. I don't think it's worth it, but one could make an argument for it.