deanebarker commented 8 years ago

Should we establish a very low-tech method of inclusion for config files?

I'm to the point where I might want to have two config files: one for data validation, and one for the actual build. So, I'll run one to ensure all my data is good, then run another one to do the actual build. In some cases, I might just want to verify data, and not actually build.

I can easily have two configs, but then I have to repeat all the code at the top: the setup and declaration code. But, if I could have that in a separate file and just included in the two others, that would help. (Yes, I could write this code as a DLL, but that's a whole different thing...)

Do we just do something like this...

[[include: my_code_file.cs]]

...and then replace those tokens with file contents before attempting compile? There are probably more graceful ways to do it, but this would be low-tech and simple.
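To make the token idea concrete, here is a minimal sketch of the replacement step, assuming the `[[include: ...]]` syntax proposed above. The names `IncludeTokens`, `Expand`, and the `readFile` callback are hypothetical and not part of Wyam; the callback stands in for something like `File.ReadAllText` so the sketch does not depend on disk access.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class IncludeTokens
{
    // Matches tokens of the form [[include: some_file.cs]] (syntax assumed from the proposal).
    private static readonly Regex Token =
        new Regex(@"\[\[include:\s*(?<path>[^\]]+?)\s*\]\]");

    // Replaces each token with the text returned by readFile for the captured path.
    // The replacement is a single pass: tokens inside included files are not expanded.
    public static string Expand(string config, Func<string, string> readFile)
    {
        return Token.Replace(config, m => readFile(m.Groups["path"].Value));
    }
}
```

Calling `IncludeTokens.Expand(configText, File.ReadAllText)` before compilation would be one way to wire this in. Because the replacement is purely textual, compiler errors would report line numbers from the expanded text rather than from the original files.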

deanebarker commented 8 years ago

Or should we do it with a command line switch?

--setup-file my_setup.cs
--declaration-file my_declaration.cs

Then it would glue those onto the front of the config.wyam file? It would be a little less flexible, since you would have those sections in those files only, whereas with inclusion you could take bits and pieces from multiple files.

Do we do both?

daveaglick commented 8 years ago

Have you seen #73? Are you thinking this technique instead of that, in addition to, or as a stop-gap until it's ready?

deanebarker commented 8 years ago

Well, that's definitely similar. But where are those included? Are they (1) setup, (2) declarations, or (3) pipelines/config?

miere43 commented 8 years ago

#73 is obviously better because you get rid of the preprocessor, which may introduce really weird untraceable errors.

daveaglick commented 8 years ago

@deanebarker The idea with #73 is that the included config files would "merge" with the files including them, and the various setup/declarations/body sections in the included file would end up in the corresponding section of the including file.

That said, it's obviously not a "low-tech method". Perhaps we just need an additional capability to say "stick whatever is in the file at xyz right here". My only concern is that if we end up doing both, it might be confusing which to use. And the latter might just be redundant once we have the former.

daveaglick commented 8 years ago

Now that preprocessor directives are implemented in #274, the idea of how this might work is taking shape. I'm most likely going to implement a directive #include or #load (not sure about naming yet) that finds the external script file and evaluates it (recursively processing any directives in the included file). Since we no longer have a setup portion, the timing question of when to run setup vs. body is no longer an issue.

The bits are in place now, so this will probably happen fairly soon.

daveaglick commented 8 years ago

Note to self: make sure to pass the context from one evaluation to the next (take a look at how Roslyn scripting does this for the REPL). I.e., if a global variable, class, etc. is defined in an included script make sure it's also available in the including scripts.

daveaglick commented 6 years ago

From @jonasdoerr in #605:

The idea of combining multiple config files (with one separate input path per file) has arisen when I thought about integrating documentation from several projects into one single webpage. Most of the projects have domain-specific data sources which are pulled by custom IModule implementations or Wyam modules with some lines of extra code in the config file. My single config file becomes more and more messy, since I have to reference a lot of custom assemblies.

I had a quick look at the code and it seems that the goal can be achieved with little effort, and embedding the engine is straightforward. Roughly speaking, I would create one ScriptManager instance per script, generate and compile the transformed scripts and finally run them sequentially on a single Engine.

@daveaglick Are there any gotchas?

jonasdoerr commented 6 years ago

I'll have a look at it.

jonasdoerr commented 6 years ago

Before I start writing code, I'd like to discuss some aspects with you on how the new #include (or similar) directive should behave. Below are some thoughts and questions. I'm not fully convinced of every word I write, but we have to start somewhere. Please share any corrections, hints and ideas.

Using Statements: These should affect only the config file that declares them. Otherwise, you can run into ambiguity problems when including other files. But aren't included namespaces visible to Razor scripts, too? Should Razor see the union of all namespace imports?
Script Code: Local variables, control structures, access to FileSystem, manipulation of Pipelines, Settings, etc. should be executed where the #include directive is located.
Method Declarations: These should be visible to the declaring config file only.
Type Declarations: These should be visible to other config files according to their access modifier. If each script is compiled into its own assembly, this feature comes for free. If not, we have to see.
Extension Method Declarations: Same as type declarations.
#assembly, #nuget Directives: These should be concatenated across all config files.
#recipe, #theme Directives: These fail if used more than once anyway. The same should hold across multiple config files.
Cyclic Includes: These are bad, in my opinion, and should result in an error.
Repeated Includes: My first thought was No!, things cannot be configured twice. But as I read in other issues, users might want to put some classes or extension methods in a config file and use them in other config files or in Razor. Even if I don't think that the config file is the best place for shared code, it might be handy for others. The interesting part is the script code part (see above), e.g. the lines in the Run() method of the compiled script. Should we execute it at the first occurrence only? Do we need an #include_once like PHP has? I think the simplest approach is the best: execute it always. If a pipeline is created in the script, the second call will fail anyway, since the pipeline already exists at this point.
Directive Arguments: Do we need additional arguments for the directive? Such as additional input paths, e.g. #include ../other/config.wyam --input ../other/input?
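The cyclic and repeated include rules above can be sketched as a small resolver. This is only an illustration under several assumptions: `IncludeResolver` and `readFile` are hypothetical names, paths are compared as plain strings without normalization, directive arguments such as `--input` are not parsed, and the included text is simply inlined rather than compiled separately.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class IncludeResolver
{
    // Expands "#include <path>" lines depth-first, starting from the given file.
    public static string Resolve(string path, Func<string, string> readFile)
    {
        return Resolve(path, readFile, new List<string>());
    }

    private static string Resolve(string path, Func<string, string> readFile, List<string> chain)
    {
        // A path that is already on the current include chain means a cycle.
        if (chain.Contains(path))
        {
            throw new InvalidOperationException(
                "Cyclic include: " + string.Join(" -> ", chain) + " -> " + path);
        }

        chain.Add(path);
        var output = new StringBuilder();
        foreach (string line in readFile(path).Split('\n'))
        {
            string trimmed = line.Trim();
            if (trimmed.StartsWith("#include ", StringComparison.Ordinal))
            {
                // Repeated includes are expanded every time ("execute it always").
                string included = trimmed.Substring("#include ".Length).Trim();
                output.Append(Resolve(included, readFile, chain));
            }
            else
            {
                output.Append(line.TrimEnd('\r')).Append('\n');
            }
        }
        chain.RemoveAt(chain.Count - 1);
        return output.ToString();
    }
}
```

Tracking only the current chain, rather than every file seen so far, is what lets the same file be included twice from different places while still rejecting a file that includes itself indirectly.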

Best Regards, Jonas

deanebarker commented 6 years ago

I am still getting notifications on this, but let me just chime in to say that I'll let you real developers work this out. I just pretend to be a developer every once in a while...

avishnyakov commented 6 years ago

I would also expect that themes could bring several Wyam configs and that it would all work well in the end. Similar to recipes, I would like a theme to carry a bunch of Wyam configs from which I can choose later while running Wyam.

daveaglick commented 6 years ago

@jonasdoerr Great questions! Here’s some initial thoughts (I’m sure I’ll have more as I think about it):

Using Statements

Agree about keeping the used scopes local to the specific config file. The Razor module automatically brings namespaces from module assemblies into scope, but it does not automatically propagate the #using namespaces from the config. Those need to be brought into scope within each Razor file again if needed, so shouldn’t be any conflict there.

Script Code

Also agree with your thoughts on execution ordering. The order that things are run should behave as if the included code was inline.

This brings up an interesting question about the mechanism by which the included script is compiled and evaluated. It’ll have to be run through Roslyn, and we’ll need to make sure the resulting in-memory assembly is available to modules (i.e., if an HtmlHelper is defined in a nested config file, it should be available and in-scope in Razor templates). The two ways I can think about doing this are:

Actually place the content of the nested scripts recursively into the parent script. To ensure correct scoping of namespaces, we’d want to enclose the whole nested script content in {...}. That might be troublesome because we’d also want any global declarations from the nested config file to be available in the outer config file. This approach would also require adding the appropriate #line directives to ensure compilation warnings and errors report out the correct line numbers and file name.
Compile each nested config file separately into its own in-memory assembly. Then those could be added as references to all outer scripts and passed to modules. We’d need to replace the #include directive in the outer compilation with the code to evaluate the newly created nested config assembly. One other benefit is that each nested config assembly could be independently cached like we do with the outer assembly.
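For the first option, the #line bookkeeping might look roughly like the sketch below. It covers only the line-number part: the enclosing {...} scoping is not addressed, file names are not escaped, and `LineDirectiveInliner` with its parameters is a hypothetical name rather than anything in Wyam.

```csharp
using System;
using System.Text;

public static class LineDirectiveInliner
{
    // Wraps included text so the C# compiler attributes diagnostics to the included file,
    // then restores the parent file's numbering for the lines that follow.
    // resumeLine is the 1-based line in the parent file that follows the #include directive.
    public static string Inline(string includedPath, string includedText, string parentPath, int resumeLine)
    {
        var sb = new StringBuilder();
        sb.Append("#line 1 \"").Append(includedPath).Append("\"\n");
        sb.Append(includedText.TrimEnd('\r', '\n')).Append('\n');
        sb.Append("#line ").Append(resumeLine).Append(" \"").Append(parentPath).Append("\"\n");
        return sb.ToString();
    }
}
```

With nested includes, the same wrapping would be applied recursively, with each level restoring the numbering of its own parent.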

More to come...

jonasdoerr commented 6 years ago

@daveaglick Perfect, so using statements won't make any trouble regardless of how we compose the parts from different config files. Let's focus on code generation and the options you mentioned.

When you say script, are you referring to the entire .generated.cs file, to the class inside it that derives from ScriptBase, or to the script code that is injected into the class's Run() method? I assume it's the former.

Somehow, the first option you mentioned, i.e. merging all scripts in a single file, doesn't feel right to me. I think we should use as many analogies as possible. Therefore, one config file should result in one C# file with its own implementation of ScriptBase. Namespace includes could stay the same as now. I like your idea of separate assemblies, but a single assembly could be an option, too.

Single Assembly

Requires distinct Script names, either local or in namespace. By the way, the Script classes could be moved to a "secret" namespace such as __Script0. They don't have to be visible to Razor or modules in general.
Full visibility of type declarations. An included script has access to the types defined by the including script. Not sure yet if that's a good or a bad thing.
Naming conflicts at compile time.
Doesn't fit well with the current IScriptManager implementation. Code generation and compilation would need to be separated.
Only the merged result could be cached.

Separate Assemblies

Separate caches, very charming. I expect that the user changes one config file only in an iterative process. You probably know better what the time consuming tasks are. Do you expect a big impact on execution time when most of the scripts are already compiled?
An included script is caller-agnostic. Only the own and included type declarations are visible.
Naming conflicts at execution time (ambiguity).
Fits better with the current implementation.

At the moment, I prefer your approach, the latter one.

Finally, one thought on script execution. Given a script like

Pipelines.Add("my_pipeline");
#include ./add_modules.config

What do you think about the following execution code?

public override void Run(IIncludeContext context)
{
    Pipelines.Add("my_pipeline");
    context.Run(typeof(ScriptNameOfAddModules));
}

The include context may provide useful features like tracing and exception handling. It's important that the include call stack is clearly visible when it comes to error reporting.
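As a rough illustration of how such a context could keep the include call stack visible, here is a minimal sketch. Everything in it is hypothetical: `IncludedScript` stands in for the generated script class (the real ScriptBase has a different shape), `IncludeException` is an invented wrapper type, and `MainConfig` and `AddModules` are example scripts.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public interface IIncludeContext
{
    void Run(Type scriptType);
}

// Stand-in for the generated script class (assumption, not Wyam's ScriptBase).
public abstract class IncludedScript
{
    public abstract void Run(IIncludeContext context);
}

public sealed class IncludeException : Exception
{
    public IncludeException(string message, Exception inner) : base(message, inner) { }
}

public sealed class IncludeContext : IIncludeContext
{
    private readonly Stack<Type> _stack = new Stack<Type>();

    public void Run(Type scriptType)
    {
        _stack.Push(scriptType);
        try
        {
            var script = (IncludedScript)Activator.CreateInstance(scriptType);
            script.Run(this);
        }
        catch (Exception ex) when (!(ex is IncludeException))
        {
            // Stack<T> enumerates newest first, so reverse to get outer-to-inner order.
            string chain = string.Join(" -> ", _stack.Reverse().Select(t => t.Name));
            throw new IncludeException("Error in include chain: " + chain, ex);
        }
        finally
        {
            _stack.Pop();
        }
    }
}

// Example scripts: MainConfig includes AddModules, which fails.
public sealed class AddModules : IncludedScript
{
    public override void Run(IIncludeContext context)
    {
        throw new InvalidOperationException("pipeline not found");
    }
}

public sealed class MainConfig : IncludedScript
{
    public override void Run(IIncludeContext context)
    {
        context.Run(typeof(AddModules));
    }
}
```

Running `new IncludeContext().Run(typeof(MainConfig))` raises an `IncludeException` whose message names both scripts in include order, with the original exception preserved as the inner exception.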

statiqdev / Statiq.Web

Add include preprocessor directive #228
