jamescourtney / FlatSharp

Fast, idiomatic C# implementation of Flatbuffers
Apache License 2.0
497 stars 50 forks source link

Complex include hierarchies is no longer supported #266

Closed ghost closed 2 years ago

ghost commented 2 years ago

As of FlatSharp 6.0.0, the flatbuffer schema include path resolution is more restrictive (due to the use of flatc).

If you have a structure of flatbuffer schema files with dependencies A > B > C, then this used to work in FlatSharp 5.x.x:

This format was not compatible with flatc anyway.

When attempting to compile this structure using flatc, the include paths must be specified like this:

In addition, you need to pass in the -I flag to the flatc compiler so that it knows to search for includes in the specified path (in this example, the parent directory where A.fbs resides).

Is there some way we could support the -I flag for flatc?

My thoughts were that if the FlatSharpSchema property were to alternatively accept a Directory containing *.fbs files, then it could

  1. use this directory path as the includes directory (flatc -I <directory>), and
  2. find and compile all *.fbs files in this directory/subdirectories (can be done in parallel)

This would also csproj configuration significantly less cumbersome in many cases.

jamescourtney commented 2 years ago

Thanks for pointing this out! It's not a regression that I had noticed, but I assumed there would be something when making the switch to flatc.

That seems possible to me. We'd need to update the FlatSharp compiler and targets files to allow specifying this as a .csproj property that gets tunneled to the command line.

I'm interested in doing a directory-based approach, but we need to be careful not to make it a breaking change. I can envision something like this:

<ItemGroup>
   <!-- process all fbs files in the given directory -->
   <FlatBufferSchemaDirectory Include="SomeDirectory\" />

   <!-- the old way -->
   <FlatBufferSchema Include="SomeFile.fbs" /> 

   <!-- What happens here? Does foo get processed twice? Should this be allowed? -->
   <FlatBufferSchema Include="SomeDirectory\Foo.fbs" />
</ItemGroup>

The other option would be something like this that gets transformed into a command line argument to FlatSharp.Compiler:

<PropertyGroup>
    <FlatSharpIncludePath>c:\fbs;c:\otherdirectory</FlatSharpIncludePath>
</PropertyGroup>

The biggest problem I can see there is that there is no way to specify include paths per FBS file. That doesn't sound like a tremendous drawback to me, but I'm interested in your feedback.

Once we settle on a design, would you be interested in making this change?

ghost commented 2 years ago

No problem!

Those are interesting points.

I mocked up a directory-based approach before raising this issue, and it worked reasonably well, however, you have a good point that you could end up with duplicate FBS files. I didn't realise that the ItemGroup was pre-processed to remove duplicates, but of course this makes sense and could otherwise be problematic.

Having a global property is a valid option and would work for my use-case, but perhaps we can come up with something that allows an include path per FBS file.

What about this third option?

<ItemGroup>
  <!-- schema file with optional include path -->
  <FlatBufferSchema Include="SomeDirectory\Foo.fbs">
    <IncludePath>SomeDirectory\</IncludePath>
  </FlatBufferSchema>

  <!-- the old way -->
  <FlatBufferSchema Include="SomeFile.fbs" />

  <!-- not processed -->
  <FlatBufferSchema Include="SomeDirectory\Foo.fbs" />
</ItemGroup>

The only problem I can think of with this third option is that it would be possible to specify duplicate FBS files with differing include paths (as in my example), and so the include path passed through to FlatSharp.Compiler would be dependent upon the processing of these duplicate entries. An incorrect ordering of these entries would result in the "wrong" include path passed through to FlatSharp.Compiler and subsequent compilation failure. I would just consider this user-error, and the problem should be easily traced based on the console output.

Obviously this third option could end up being very verbose with a large number of FBS files, even though it would solve the issue and we could stop here.

...

But let's keep going!

What if we take it one step further and were to support wildcards? From my limited understanding, I think using wildcards is a somewhat more conventional approach for msbuild. If you were to combine these two features, something like this should be possible:

<ItemGroup>
  <!-- wildcard support with optional include path -->
  <FlatBufferSchema Include="SomeDirectory\**\*.fbs">
    <IncludePath>SomeDirectory\</IncludePath>
  </FlatBufferSchema>

  <!-- the old way -->
  <FlatBufferSchema Include="SomeFile.fbs" />

  <!-- not processed -->
  <FlatBufferSchema Include="SomeDirectory\Foo.fbs" />

  <!-- not processed -->
  <FlatBufferSchema Include="SomeDirectory\Bar.fbs">
    <IncludePath>SomeDirectory\</IncludePath>
  </FlatBufferSchema>
</ItemGroup>

I am assuming here that it is possible for all of the paths to be resolved, duplicates removed, and individual FBS files passed through to FlatSharp.Compiler with the corresponding include path. Any duplicates that exist with differing include paths could suffer the same problem as above, which I am not particularly concerned about.

What do you think?

I'd be interested in making these changes once we settle on a design - and learning a bit about msbuild in the process!

ghost commented 2 years ago

I've done some reading up on MSBuild and this actually seems easier than I first thought, since wildcards are inherently supported.

With a folder structure like

The following example does exactly what I was envisioning:

<Project>
  <ItemGroup>
    <FlatBufferSchema Include="SomeDirectory\**\*.fbs">
      <IncludePath>SomeDirectory</IncludePath>
    </FlatBufferSchema>

    <FlatBufferSchema Include="foo.fbs" />

    <FlatBufferSchema Include="SomeDirectory\bar.fbs" />

    <FlatBufferSchema Include="SomeDirectory\SubDirectory\baz.fbs">
      <IncludePath>SomeDirectory\SubDirectory</IncludePath>
    </FlatBufferSchema>
  </ItemGroup>

  <Target Name="FlatSharpFbsCompile" BeforeTargets="ResolveAssemblyReferences">
    <!-- required to remove duplicates with different IncludePath properties -->
    <RemoveDuplicates Inputs="@(FlatBufferSchema)">
      <Output TaskParameter="Filtered" ItemName="FilteredItems"/>
    </RemoveDuplicates>

    <Message Importance="High" Text="flatc &quot;%(FilteredItems.fullpath)&quot;" Condition="%(FilteredItems.IncludePath) == ''" />
    <Message Importance="High" Text="flatc -I &quot;%(FilteredItems.IncludePath)&quot; &quot;%(FilteredItems.fullpath)&quot;" Condition="%(FilteredItems.IncludePath) != ''" />
  </Target>
</Project>

Output:

flatc "C:\src\TargetsTest\foo.fbs"
flatc -I "SomeDirectory" "C:\src\TargetsTest\SomeDirectory\bar.fbs"
flatc -I "SomeDirectory" "C:\src\TargetsTest\SomeDirectory\SubDirectory\baz.fbs"

This may be obvious to those familiar with MSBuild, but anyway, this was enlightening to myself!

It should be relatively straight forward to support this if you are happy with it.

jamescourtney commented 2 years ago

That makes sense to me, and that seems like a reasonable proof of concept. Thanks for all the effort! I'd welcome a PR whenever you want to submit it.

My only thought for implementation is that IncludePath needs to accept semicolon-delimited paths, rather than just assuming it's a single directory. I assume flatc allows you to pass multiple paths or multiple -I arguments. Let's not restrict it to just the one.

jamescourtney commented 2 years ago

I think we can call this one fixed :)