Closed maetl closed 1 year ago
Draft specification, pulled from current docs. The only major change I’m thinking of making to this is defining separate modifiers for startcase and titlecase, with the former brute-forcing all words to have an initial cap and the latter following AP styleguide or Chicago Manual of Style rules.
{expansion.uppercase}
Converts a template expansion to uppercase.
{expansion.lowercase}
Converts a template expansion to lowercase.
{expansion.titlecase}
Converts a template expansion to title case.
{expansion.sentencecase}
Converts a template expansion to sentence case.
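To make the proposed startcase/titlecase split concrete, here is a rough C# sketch. The class name `CaseModifiers`, the `MinorWords` list, and the rule that minor words stay lowercase unless they are first or last are all assumptions for illustration; real AP or Chicago rules are considerably more involved.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CaseModifiers
{
    // Hypothetical, deliberately tiny list of words that stay lowercase in titles.
    private static readonly HashSet<string> MinorWords = new HashSet<string>
    {
        "a", "an", "and", "at", "but", "by", "for", "in", "of", "on", "or", "the", "to"
    };

    // Uppercase the first character and lowercase the rest of a single word.
    private static string Capitalize(string word)
    {
        if (word.Length == 0) return word;
        return char.ToUpperInvariant(word[0]) + word.Substring(1).ToLowerInvariant();
    }

    // startcase: brute-force an initial capital onto every space-separated word.
    public static string StartCase(string input)
    {
        return string.Join(" ", input.Split(' ').Select(Capitalize));
    }

    // titlecase (simplified assumption): minor words stay lowercase
    // unless they are the first or last word.
    public static string TitleCase(string input)
    {
        string[] words = input.Split(' ');
        for (int i = 0; i < words.Length; i++)
        {
            string lower = words[i].ToLowerInvariant();
            bool isEdge = i == 0 || i == words.Length - 1;
            words[i] = (!isEdge && MinorWords.Contains(lower)) ? lower : Capitalize(words[i]);
        }
        return string.Join(" ", words);
    }
}
```

With this sketch, "the lord of the rings" becomes "The Lord Of The Rings" under StartCase and "The Lord of the Rings" under TitleCase.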
I’m also considering having a pluralising and singularising inflector built-in—and maybe some other ‘small’ linguistics helpers (like indefinite article for nouns: a/an). But none of this is a requirement for now.
A Dictionary<string, Func<string, Options, string>> mapping the lowercase etc. identifiers used by the other languages to anonymous functions which perform the required operation is demonstrated in my modifiers branch, with sample tests illustrating upper, lower, and Spongebob.
I think a C# dictionary can be made to do everything the JS/Ruby implementations can, since there doesn't seem to be a requirement that these are actually dynamic methods, only that they can be accessed somehow and potentially added at runtime.
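For readers without the branch to hand, the dictionary idea might look roughly like the sketch below. This is not the code from the modifiers branch: the `ModifierTable` class, its method names, and the shape of `Options` (a wrapper around a `Random`) are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the Options type mentioned above.
public class Options
{
    public Random Rng { get; }
    public Options(Random rng) { Rng = rng; }
}

public static class ModifierTable
{
    // Modifier names map to functions taking the input string and Options.
    private static readonly Dictionary<string, Func<string, Options, string>> Modifiers =
        new Dictionary<string, Func<string, Options, string>>
        {
            ["uppercase"] = (input, opts) => input.ToUpperInvariant(),
            ["lowercase"] = (input, opts) => input.ToLowerInvariant(),
        };

    // Adding a modifier at runtime is just another dictionary write.
    public static void Register(string name, Func<string, Options, string> modifier)
    {
        Modifiers[name] = modifier;
    }

    // Look the modifier up by name and run it; unknown names are an error here.
    public static string Apply(string name, string input, Options opts)
    {
        if (!Modifiers.TryGetValue(name, out var modifier))
            throw new KeyNotFoundException($"Unknown modifier: {name}");
        return modifier(input, opts);
    }
}
```

A caller would register extras with something like ModifierTable.Register("hashtag", (input, opts) => "#" + input) and then call ModifierTable.Apply("hashtag", "calyx", opts).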
For sure. JavaScript at its core is basically a dictionary data structure with inheritance via prototypes (pointers/references between dictionaries) so this is always going to be a viable way of getting some form of dynamic typing to work.
There’s no requirement for the methods to be dynamic; they will most likely be compile-time things. There might be some cases where people would want to extend grammars with custom modifiers that do specific things based on runtime state, but that is a bit of a hack/edge case for what this API is intended to do, and the recommended advice would be to think about the modifiers as stateless string formatting methods.
Here’s another approach, perhaps more ‘trad’ and I have no idea if it works exactly as assumed in standard C# (or in Unity). Also not sure whether reflection is going to incur some weird performance issues, but I do like the idea that the modifiers are declared as well-defined string formatting types rather than anonymous functions.
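using System;
using System.Reflection;
namespace Calyx {
  namespace Modifiers {
    public static class Invocation
    {
      public static string Format(string modifierName, string input)
      {
        bool throwOnError = false;
        bool ignoreCase = true; // This avoids need for symbol table
        // Assembly.GetType is an instance method, so start from the
        // assembly that declares the modifier classes
        Type modifierClass = typeof(Invocation).Assembly.GetType(
          $"Calyx.Modifiers.{modifierName}",
          throwOnError,
          ignoreCase
        );
        // With throwOnError false, an unknown name comes back as null
        if (modifierClass == null) {
          throw new ArgumentException($"Unknown modifier: {modifierName}");
        }
        var modifierMethod = modifierClass.GetMethod("Format");
        Modifier modifierInstance = (Modifier)Activator.CreateInstance(modifierClass);
        // Invoke returns object, so cast the result back to string
        return (string)modifierMethod.Invoke(modifierInstance, new object[] { input });
      }
    }
    interface Modifier {
      // Interface members are implicitly public
      string Format(string input);
    }
    // C# uses ':' where Java uses 'implements'
    class UpperCase : Modifier {
      public string Format(string input)
      {
        return input.ToUpper();
      }
    }
    class LowerCase : Modifier {
      public string Format(string input)
      {
        return input.ToLower();
      }
    }
  }
}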
Here’s another way of doing it that punts all dictionary lookups to registration with a static class dict and everything else basically being hard-coded at compile time. There’s almost certainly a more clever/maintainable way to do this, but at least I know exactly what is going on when this code executes.
using System.Collections.Generic;
// Assumed delegate type for a modifier hook: string in, string out
public delegate string ModifierFunc(string input);
public static class Extensions {
  private static readonly Dictionary<string, ModifierFunc> SymbolTable = new Dictionary<string, ModifierFunc>();
  public static void Register(string modifier, ModifierFunc hook)
  {
    SymbolTable.Add(modifier, hook);
  }
  public static bool ModifierExists(string modifier)
  {
    return SymbolTable.ContainsKey(modifier);
  }
  public static string InvokeModifier(string modifier, string input)
  {
    return SymbolTable[modifier](input);
  }
}
// Called from application startup code, not at namespace level:
Extensions.Register("hashtag", (input) => $"#{input}");
public static class Invocation
{
  public static string InvokeModifier(string modifier, string input)
  {
    // Registered extensions take precedence over the built-ins
    if (Extensions.ModifierExists(modifier)) {
      return Extensions.InvokeModifier(modifier, input);
    } else {
      switch (modifier) {
        case "uppercase": return input.ToUpper();
        case "lowercase": return input.ToLower();
        default: return input; // hack
      }
    }
  }
}
Okay, and one more idea that is actually quite cool as an extension API, but I’m not sure yet how to implement.
using Calyx;
using Calyx.Extension;
class MyExtensions {
  [Modifier("hashtag")]
  public static string FormatHashtag(string input, Options opts)
  {
    return $"#{input}";
  }
  [Modifier("studlycaps")]
  public static string FormatStudlyCaps(string input, Options opts)
  {
    string result = "";
    foreach (char c in input) {
      // ToUpper/ToLower are static methods on char
      result += (opts.Rng.Next(2) == 1) ? char.ToUpper(c) : char.ToLower(c);
    }
    return result;
  }
  [Modifier("explode")]
  public static string ExplodeSpacesInString(string input, Options opts)
  {
    string result = "";
    int length = input.Length;
    for (int i=0; i<length; i++) {
      // Both branches must be strings; no trailing space after the last char
      result += (i < length-1) ? input[i] + " " : input[i].ToString();
    }
    return result;
  }
}
We can use Reflection and custom attributes to find modifier methods, but this creates a situation where custom modifiers might be created with the wrong method signature. There's nothing stopping any method being tagged with our custom attribute, and all we can do is throw an exception at runtime if the signature doesn't match.
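A rough sketch of what that runtime check could look like, assuming a `ModifierAttribute` that carries the modifier name and a `string (string, Options)` signature. `ModifierScanner`, `SampleModifiers`, and the minimal `Options` class are hypothetical names for illustration, not part of Calyx.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical attribute, named after the [Modifier("...")] usage above.
[AttributeUsage(AttributeTargets.Method)]
public sealed class ModifierAttribute : Attribute
{
    public string Name { get; }
    public ModifierAttribute(string name) { Name = name; }
}

// Hypothetical stand-in for the Options type.
public class Options
{
    public Random Rng { get; } = new Random();
}

public static class ModifierScanner
{
    // Collects tagged public static methods from one class,
    // throwing if a tagged method has the wrong signature.
    public static Dictionary<string, Func<string, Options, string>> Scan(Type container)
    {
        var found = new Dictionary<string, Func<string, Options, string>>();
        foreach (MethodInfo method in container.GetMethods(BindingFlags.Public | BindingFlags.Static))
        {
            var tag = method.GetCustomAttribute<ModifierAttribute>();
            if (tag == null) continue; // untagged methods are ignored

            ParameterInfo[] p = method.GetParameters();
            bool ok = method.ReturnType == typeof(string)
                && p.Length == 2
                && p[0].ParameterType == typeof(string)
                && p[1].ParameterType == typeof(Options);
            if (!ok)
                throw new InvalidOperationException(
                    $"Modifier '{tag.Name}' on {method.Name} must have signature string (string, Options)");

            found[tag.Name] = (Func<string, Options, string>)
                method.CreateDelegate(typeof(Func<string, Options, string>));
        }
        return found;
    }
}

// Example container with one correctly tagged method.
public static class SampleModifiers
{
    [Modifier("hashtag")]
    public static string FormatHashtag(string input, Options opts)
    {
        return "#" + input;
    }
}
```

Calling ModifierScanner.Scan(typeof(SampleModifiers)) returns a table with a "hashtag" entry. A tagged method with any other signature fails at scan time rather than at compile time, which is exactly the weakness described above.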
An even more OO way might be to define each modifier as its own class, instead of as methods on some static class, where the method to run the modifier is a basic public string Modify(string input) and any dependencies required can be injected in its (optional) constructor.
This removes the need to pass an Options to methods that don't need it, and allows a modifier to call on more than just an Options object.
public interface IStringModifier {
  string Modify(string input);
}
[Calyx.Attributes.ModifierName("uppercase")]
public class ToUpperCase: IStringModifier {
  // no need to write a default constructor
  public string Modify(string input) {
    return input.ToUpper();
  }
}
[Calyx.Attributes.ModifierName("studlycaps")]
public class ToStudlyCaps: IStringModifier {
  private readonly Options opts;
  private readonly double fractionOfUppercaseLetters;
  public ToStudlyCaps(Options opts, double fractionOfUppercaseLetters) {
    // save the ctor params as privates
    this.opts = opts;
    this.fractionOfUppercaseLetters = fractionOfUppercaseLetters;
  }
  public string Modify(string input) {
    string result = "";
    foreach (char c in input) {
      // NextDouble() returns a value in [0, 1)
      result += (opts.Rng.NextDouble() < fractionOfUppercaseLetters) ? char.ToUpper(c) : char.ToLower(c);
    }
    return result;
  }
}
One drawback is the need to register every implementation of IStringModifier instead of registering one static class containing multiple modifiers. I see this as slight.
It's probably also possible for end users to just send a reference to their own assembly and use reflection to register any class that implements IStringModifier.
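As a sketch of that assembly-scanning idea: the code below walks an assembly, picks out concrete IStringModifier implementations that carry a name attribute, and instantiates them. `ModifierRegistry`, `FromAssembly`, and this simplified `ModifierNameAttribute` are assumptions for illustration. It only handles classes with a public parameterless constructor, so a modifier that needs constructor arguments would still have to be registered by hand.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public interface IStringModifier
{
    string Modify(string input);
}

// Hypothetical attribute coupling a modifier class to its name.
[AttributeUsage(AttributeTargets.Class)]
public sealed class ModifierNameAttribute : Attribute
{
    public string Name { get; }
    public ModifierNameAttribute(string name) { Name = name; }
}

[ModifierName("uppercase")]
public class ToUpperCase : IStringModifier
{
    public string Modify(string input)
    {
        return input.ToUpperInvariant();
    }
}

public static class ModifierRegistry
{
    // Builds a name -> instance table from every suitable class in the assembly.
    public static Dictionary<string, IStringModifier> FromAssembly(Assembly assembly)
    {
        var registry = new Dictionary<string, IStringModifier>();

        // Only concrete implementations with a public parameterless constructor.
        var candidates = assembly.GetTypes().Where(t =>
            typeof(IStringModifier).IsAssignableFrom(t)
            && !t.IsAbstract
            && !t.IsInterface
            && t.GetConstructor(Type.EmptyTypes) != null);

        foreach (Type type in candidates)
        {
            var tag = type.GetCustomAttribute<ModifierNameAttribute>();
            if (tag == null) continue; // unnamed classes are skipped
            registry[tag.Name] = (IStringModifier)Activator.CreateInstance(type);
        }
        return registry;
    }
}
```

An end user would call ModifierRegistry.FromAssembly(typeof(ToUpperCase).Assembly), or pass their own assembly, and look modifiers up by name in the returned dictionary.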
Now that I think about it, this seems to make the custom attributes (edit: almost*) worthless as they only apply statically and we're now working with instances that'll need to be registered at runtime.
Is the increase in extendability worth it?
(*) we can still use it to couple the modifier name with the modifier instance, as long as we don't do something like:
var sc1 = new ToStudlyCaps(opts, 0.5);
var sc2 = new ToStudlyCaps(opts, 0.7);
Some great work on this. Closing as the overall feature and architecture questions at the macro level are resolved. We can pick this up again at the micro level by opening issues to deal with specific filter questions and implementation details.
Output modifiers (filter chains) format the string that is generated by the grammar production. They are defined by a chain of . separated references following the rule.
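As an illustration of how such a chain might be evaluated in C#, the sketch below splits an expression like rule.uppercase.hashtag on the . separator and pipes the rule's output through each modifier from left to right. `FilterChain`, `Evaluate`, the `expandRule` callback standing in for the grammar, and the three modifier names are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FilterChain
{
    // Hypothetical modifier table; the real built-in names depend on the spec.
    private static readonly Dictionary<string, Func<string, string>> Modifiers =
        new Dictionary<string, Func<string, string>>
        {
            ["uppercase"] = s => s.ToUpperInvariant(),
            ["lowercase"] = s => s.ToLowerInvariant(),
            ["hashtag"] = s => "#" + s,
        };

    // expression has the form "rule.modifier1.modifier2";
    // expandRule stands in for the grammar expanding the rule name.
    public static string Evaluate(string expression, Func<string, string> expandRule)
    {
        string[] parts = expression.Split('.');
        string output = expandRule(parts[0]);

        // Each modifier receives the output of everything to its left.
        foreach (string name in parts.Skip(1))
        {
            if (!Modifiers.TryGetValue(name, out var modifier))
                throw new ArgumentException($"Unknown modifier: {name}");
            output = modifier(output);
        }
        return output;
    }
}
```

If the rule animal expands to "snowy owl", then animal.uppercase.hashtag gives "#SNOWY OWL", while animal.hashtag.lowercase gives "#snowy owl".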
There are three separate stages of work needed to support this feature.
Specification
The first stage is to decide on the spec and documentation for the built-in set of string modifiers that can work across different language implementations of Calyx. Currently the Ruby and JavaScript libraries delegate directly to the runtime string objects for any method of arity 0. This means you get inconsistencies like .upcase in Ruby grammars but .toUpperCase in JS grammars. The specific set of built-in modifiers needs to be normalised across all implementations.
C# modifier implementation
The basic behaviour to handle built-in string modifiers can be developed without needing the exact spec to be defined, starting with some basic String methods like uppercase and lowercase (names can always be changed in future to meet the spec, but we definitely know these two will be included).
Note that modifiers also need to be chainable from left to right, so that the output of left hand side elements in the expression pipes into the right hand side elements, denoted by the . sigil. (I debated using | here, but that is most commonly seen as OR in grammar or logic notations and string template engines aren’t using it as frequently as they used to as a pipe/filter syntax, so I think the . is more understandable, though open to argument on this).
C# modifier extensions
What would a C# extension API look like? In Ruby, authors can embed a module with custom string methods to expand the allowed syntax with their own formatting.
In JavaScript, this feature isn’t fully documented or implemented but is very easy to do in either language because both offer many different ways of dynamically adding functions to an object at runtime. C# is very much not like that, so we need to define a more formal API for extensions.
A good example of a ‘hello world’ string extension I would like to document is a ‘studly caps’ or Spongebob mocking text filter. After getting some informal feedback on the generator API and naming conventions, I don’t think this should go in the built-in API, but it is a great sample code/documentation piece for demonstrating how the extension API works.