dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License

Allow creating mixed language projects #26765

Closed ghost closed 6 years ago

ghost commented 6 years ago

In ASP.NET, a website can have pages written in VB.NET and others written in C#. I suggest you allow other project types to do the same, so that a project can have files written in different languages such as C#, F#, VB.NET, Q#, VPL, and any other .NET language. Each file is distinguished by its extension (.cs, .vb, etc.). This would maximize code reuse without any extra effort to convert some code into a class library, especially if it has a user interface. This sort of project could have a separate template so that current project templates are not affected.

HaloFour commented 6 years ago

#21024

HaloFour commented 6 years ago

IIRC Roslyn was never designed with this in mind and it's probably not possible without a rewrite, which ain't going to happen. Roslyn only covers the two languages C# and VB.NET, but even then those two languages have their own incompatible syntax trees.

sharwell commented 6 years ago

Duplicate of #21024

ghost commented 6 years ago

@HaloFour

but even then those two languages have their own incompatible syntax trees.

I don't get it. Why should they be compatible? Each compiler will compile its own files separately, then it all goes to IL. What is the difference between this and a solution having a C# project and a VB.NET project with one of them linked to the other? When you debug the solution, you can trace code through the two of them. Just do that inside one project.

ghost commented 6 years ago

I have another idea, but it is more difficult: I wish I could write functions in different languages in the same file, using some attributes to mark each function with its language! Furthermore, I hope that some day I can write blocks of code in the same function in any language, where each language's code is put in some #language block! I think mixing different language files is way easier!

HaloFour commented 6 years ago

@MohammadHamdyGhanem

That's not done on a file-by-file basis. The compiler has to treat the entire project as a single unit and resolve everything before any IL is emitted.

ghost commented 6 years ago

@HaloFour Nothing starts perfect. Let's say it will be just a wrapper for creating 2 projects in the form of one! Let's keep the boundary between the 2 languages and make some rough rules: Each project has a primary language (say C#). Files in any other language (say VB.NET files) are not allowed to use any code from the C# files (otherwise the VB.NET compiler will complain about unknown names), but C# files can use the public types from those foreign files. This one-way link will allow the VB.NET files to be compiled first because they are independent of the C# files, then C# can compile its own files depending on the IL that VB.NET generated. These are the same steps that happen when we reference another project in the solution. Of course one can choose to make VB.NET the primary language and add C# files, which will be compiled first, etc. This may not be much, but it is a step that will make some things easier, and it can be built upon in the future.

CyrusNajmabadi commented 6 years ago

@MohammadHamdyGhanem I'm interested in this suggestion. If you could work on a prototype that would be very helpful. It would help prove out the viability of this suggestion, while also helping to identify the places that would need significant design work.

CyrusNajmabadi commented 6 years ago

Nothing starts perfect. Let's say it will be just a wrapper for creating 2 projects in the form of one!

Why have this when you can just have two projects? All the same restrictions are in place... so nothing really valuable has been gained.

CyrusNajmabadi commented 6 years ago

Each compiler can talk with the other as

You did not complete this.

CyrusNajmabadi commented 6 years ago

but C# files can use the public types from those foreign files. This one-way link will allow the VB.NET files to be compiled first because they are independent of the C# files, then C# can compile its own files depending on the IL that VB.NET generated.

This is already supported today. You can just have two projects. If you want to make a new type of VS project that simulates this, you can definitely go ahead and try to do that. VS is extensible enough that you likely could make this work by having a pseudo-virtual project that really just became two projects behind the scenes**. Though I don't really know what this would buy you, as all you'd be doing is saving one icon in VS.

--

** Note: this is basically how TypeScript works in VS when you open loose files that are not backed by tsconfig files. They transitively figure out the set of files to compile, form "virtual projects" out of that, and then host those virtual projects in a top level hierarchy.
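The two-project arrangement discussed in this comment can be sketched directly against the Roslyn compilation APIs. This is only an illustration, assuming the Microsoft.CodeAnalysis.CSharp and Microsoft.CodeAnalysis.VisualBasic NuGet packages are referenced; the names `TwoStepBuild`, `Greeter`, and `Consumer` are made up for the example. The VB.NET code is compiled and emitted first, and the C# compilation then consumes the emitted assembly image as a metadata reference, which is roughly what a project-to-project reference amounts to.

```csharp
using System;
using System.IO;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.VisualBasic;

public static class TwoStepBuild
{
    // Returns true when both the VB.NET part and the C# part compile without errors.
    public static bool Build()
    {
        // Core library reference; assumes the running runtime's location is usable.
        MetadataReference corlib =
            MetadataReference.CreateFromFile(typeof(object).Assembly.Location);

        // Step 1: compile the VB.NET "project". It knows nothing about the C# code.
        var vbTree = VisualBasicSyntaxTree.ParseText(@"
Public Class Greeter
    Public Shared Function Hello() As String
        Return ""Hello from VB""
    End Function
End Class");
        var vb = VisualBasicCompilation.Create(
            "VbPart",
            new[] { vbTree },
            new[] { corlib },
            new VisualBasicCompilationOptions(OutputKind.DynamicallyLinkedLibrary));

        // Emit to memory: the C# compiler reads the assembly's metadata,
        // not the VB syntax trees.
        var image = new MemoryStream();
        var vbResult = vb.Emit(image);
        if (!vbResult.Success)
        {
            foreach (var d in vbResult.Diagnostics) Console.WriteLine(d);
            return false;
        }

        // Step 2: compile the C# "project" against the emitted VB assembly.
        var csTree = CSharpSyntaxTree.ParseText(@"
public static class Consumer
{
    public static string Run() => Greeter.Hello();
}");
        var cs = CSharpCompilation.Create(
            "CsPart",
            new[] { csTree },
            new MetadataReference[] { corlib, MetadataReference.CreateFromImage(image.ToArray()) },
            new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));

        var errors = cs.GetDiagnostics()
                       .Where(d => d.Severity == DiagnosticSeverity.Error)
                       .ToList();
        foreach (var d in errors) Console.WriteLine(d);
        return errors.Count == 0;
    }

    public static void Main() => Console.WriteLine(Build() ? "Both parts compiled." : "Build failed.");
}
```

Note that the link is one-way, as in the proposal: the VB.NET code cannot see `Consumer`, because it is compiled before the C# code exists.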

ghost commented 6 years ago

Each compiler can talk with the other as .... In the next phase. :) Phase one, I think, can be easy and low cost. As for why this is needed: suppose I only need a small feature of VB.NET, like XML strings. Should I create a new project, compile it, and reference it in the C# project? Or do I just add one .vb file to the same project? If this phase is done, it will be easy to make @paul1956's idea possible:

Dim X as Byte = CSharp(uncheck(x => 7 * 100);)

I modified his code to fit my rules. C# (which is the secondary compiler here) will compile all CSharp() blocks first, exposing the result as a public member, then VB.NET can use the generated IL to compile its code as usual. Neither language can look at the other's syntax tree or whatever.

ghost commented 6 years ago

@CyrusNajmabadi I'm not an expert in compilers and project templates. Implementation in these areas is not suitable for me now.

HaloFour commented 6 years ago

@MohammadHamdyGhanem

You can't compile IL in a vacuum. I believe the smallest self-contained unit would be a module, but a module contains full types, not individual functions. On top of that neither the C# nor VB.NET compilers know how to do anything with IL. They work from the metadata embedded in an assembly. They have no mechanism at all to take an arbitrary chunk of IL representing a function or an expression and interpret anything out of it.

Don't get me wrong, I think it's an interesting idea. It just needed to be brought up 8-10 years ago when Roslyn was still on the drawing board. Trying to retroactively add it into the compiler infrastructure would represent a significant amount of work. The reward for doing so is at best dubious as there is little evidence that mixing languages within a project would bring much benefit, especially when only two languages (at best) would be supported. Being able to mix C# and F# would be a more interesting scenario, but F# is a completely separate compiler that shares nothing with Roslyn.

And if you really, really need to mix languages, you can use tooling to take multiple projects and "merge" them into a single assembly. That approach is more involved but has the massively distinct advantage of not requiring anything of the compiler, so you can feel free to mix Oxygene with COBOL.NET all day long and neither compiler will be the wiser.

CyrusNajmabadi commented 6 years ago

I'm not an expert in compilers and project templates.

I'm happy to help try to walk you through it. If you are passionate about this, it would be the quickest path to it happening.

ghost commented 6 years ago

@HaloFour @CyrusNajmabadi I think a project with mixed C# and VB.NET files can be possible even when each language's code depends on the other's (a two-way link). This can work as follows: 1- Calling a public member from the other language needs to be marked with the keyword @external:

@external.foo.DoSomething();
@external.foo.Username = "anything";

2- Phase 1: Each language checks syntax errors as usual but ignores the @external part (it assumes that these members exist in this phase, and checks every code line based on this assumption).
3- VB.NET and C# compile their code to IL. Each language generates metadata that describes the public members in the code that it compiled.
4- Phase 2: VB uses the C# metadata to check the @external parts, and C# uses the VB metadata to check the @external parts. When everything checks out, the IL from the two languages is combined into one exe or dll and all @external markers are eliminated from that IL code.

I think these steps don't require radical changes in VB or C# or Roslyn. Each compiler is still isolated from the other, and completely ignorant of its structure and semantics and whatever. VB.NET and C# are black boxes, and the communication between them is done through the IL metadata, in a two-phase compilation.
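The phase 2 check in this proposal can be modelled as a plain name lookup. The sketch below is a toy model only, not how either compiler works: `LanguageUnit`, `TwoPhaseCheck`, and the member names are hypothetical, and it ignores types, overloads, and everything else real binding involves.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model of what one language's compiler would publish after phase 1.
public sealed class LanguageUnit
{
    public string Language { get; }
    public HashSet<string> PublicMembers { get; }  // members this unit exports
    public List<string> ExternalUses { get; }      // names written after "@external."

    public LanguageUnit(string language, IEnumerable<string> publicMembers, IEnumerable<string> externalUses)
    {
        Language = language;
        PublicMembers = new HashSet<string>(publicMembers);
        ExternalUses = externalUses.ToList();
    }
}

public static class TwoPhaseCheck
{
    // Phase 2: list every @external use that the other unit does not export.
    public static List<string> Unresolved(LanguageUnit user, LanguageUnit provider) =>
        user.ExternalUses.Where(name => !provider.PublicMembers.Contains(name)).ToList();

    public static void Main()
    {
        var cs = new LanguageUnit("C#", new[] { "Bar.Compute" }, new[] { "foo.DoSomething", "foo.Username" });
        var vb = new LanguageUnit("VB.NET", new[] { "foo.DoSomething", "foo.Username" }, new[] { "Bar.Missing" });

        // Each side is checked against the other's exported members.
        Console.WriteLine("C# unresolved: " + string.Join(", ", Unresolved(cs, vb)));
        Console.WriteLine("VB unresolved: " + string.Join(", ", Unresolved(vb, cs)));
    }
}
```

The model only shows where the cross-check would sit. It does not address the point raised earlier in the thread that a compiler needs the referenced symbols before it can emit IL for the code that uses them.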

CyrusNajmabadi commented 6 years ago

Definitely looks like something you could create a prototype for! I look forward to seeing it :)

ghost commented 6 years ago

@CyrusNajmabadi I thought the idea behind the community is working together. If I had to learn and work in every area of .NET alone, it wouldn't be much of a community! I have many ideas and to-do tasks, but I have fallen behind on the deadlines of some obligations, so waiting for me to sort out my issues, learn new things and implement each idea I suggest here will take years! I can simply share ideas now, and everyone is free to build upon them if they are of any value. Thanks.

svick commented 6 years ago

@MohammadHamdyGhanem Working together doesn't mean that you can just make suggestions and assume others will do all of the work.

Many of us have lots of ideas of our own, and not enough time to implement all of them (I know I do). If you have an idea and don't manage to convince other members of the community (paid by MS or not) to do the work, your best option is to lead the work yourself.

ghost commented 6 years ago

@svick Fair enough. But if I were to enter the realm of compilers, the first thing I would do is create a new .NET language with Arabic syntax, and implement all my suggestions in it! This would leave me doing nothing else for years, trying to keep track of new .NET features! Right now, I want to focus on ASP.NET and its upcoming Blazor. There are other new technologies I want to learn about, such as Xamarin, Docker containers, and cloud programming. Going low level seems a difficult choice for me right now. However, I'm working on improving the Regex Builder and it is nearly complete. I have an issue escaping chars, because the Regex.Escape method doesn't consider the case of putting the pattern in [], so it escapes the [ itself and the chars inside the []! I will write a function to solve this now, and revise all the Verex methods to decide which type of escaping should be used. I will publish this after it's stable, so others can test and modify it. A question: how do you notify others about your GitHub projects? Is there a place or a means to tell .NET developers about this? Or are there any tags or whatever that give them a notification? Thanks.
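The escaping function mentioned here could look roughly like the sketch below. It is an assumption about what such a helper might do, not the Regex Builder's actual code; `CharClassEscaper` and `EscapeForCharClass` are hypothetical names. `Regex.Escape` targets characters that are special outside a character class and leaves `]` and `-` untouched, so a separate routine handles the characters that are special inside `[...]`.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class CharClassEscaper
{
    // Escapes only the characters that are special inside a [...] character class:
    // backslash, closing bracket, caret, and hyphen.
    public static string EscapeForCharClass(string chars)
    {
        var sb = new StringBuilder(chars.Length * 2);
        foreach (char c in chars)
        {
            if (c == '\\' || c == ']' || c == '^' || c == '-')
                sb.Append('\\');
            sb.Append(c);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // Build a class that should match any one of the characters + - ]
        string raw = "+-]";
        string pattern = "[" + EscapeForCharClass(raw) + "]";

        Console.WriteLine(pattern);
        Console.WriteLine(Regex.IsMatch("]", pattern));
        Console.WriteLine(Regex.IsMatch("a", pattern));
    }
}
```

Keeping the two kinds of escaping in separate functions lets the caller decide which one applies, depending on whether the text ends up inside or outside brackets.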

CyrusNajmabadi commented 6 years ago

I thought the idea behind the community is working together.

Sure, but that doesn't mean just throwing out ideas and hoping people will work on them "just because". The team and community members are already working on the enormous set of work felt to be important. If you want new stuff to be considered, you need to at least be willing to do some of the leg-work yourself :)

As I mentioned previously, I would be happy to help out with advice/info. If you started working on this stuff and it proved viable, I would also be willing to help out there. But you've at least got to get the ball rolling, and not expect others to just do it for you.

seriussoft commented 6 years ago

@CyrusNajmabadi I'd be interested in working on a template or prototype of a mixed compiler, or at least an inverted compiler (if in a C# project, your alien code would be VB.NET files, or vice versa) such that C# can be converted to VB.NET and vice versa. I'm thinking about taking advantage of the individual file compiler option available in VS. It's been used for T4 templating as well as for blocking "alien code files" (see definition above) from compiling with the project, instead compiling them individually and then allowing their use in your project language classes. I've got a bit of info linked in the related issue that's still open: #21024, but I'd be approaching it similarly to the many single-file-compiler add-ons such as Xsd2Code and the RazorCompilers.

You offered assistance to @MohammadHamdyGhanem on compiler info. I'd love to take you up on that offer if it is still on the table, and I would be extremely grateful. I'm passionate about having mixed language projects. I'd really want to just start with support for C# and VB.NET interoperability, or a faux/emulated understanding of each other, such that you can run small VB.NET code files/classes within a C# project and vice versa.

To me, there are many great features in each language that are not supported in the other and, much to my chagrin, have been marked for "no support" in the opposing language. While C# is my preferred .NET language, VB.NET has some awesome abilities that greatly speed up development of things such as XML integration (XML-to-LINQ), the 'with' keyword for readability, the optional param skipping specific to COM+ objects, and other features that my 3AM brain on a Saturday cannot recall. At the same time, I often work in whatever project my clients had made beforehand, and often live in the VB.NET realm where I wish I had access to a few C# features and/or syntax for aiding both readability and usability (especially when it comes to system32 integrations, native dll extern references, and dynamic script execution, to name a few). In many cases, it doesn't make sense to have separate projects (especially as the clients don't need a bunch of code from another language where a simple small class would suffice), and you start to get into these nasty circular dependencies when your alien project has to talk back and forth with your current project, something that's not an issue if the 2 pieces of code live within the same project and module.

Anyways, I feel there's a justification here for it, as do others. I'm willing to stake some of my time and reputation on getting a prototype going, but will definitely need help in the area of compilers, linters, and/or whatever methodology would allow a quicker prototype that doesn't back the next step into a corner. I'm very familiar with parsing - I built a statistical based math parsing engine for a Survey CMS project as part of my initial coming-of-age or first big client project. It spanned many years, started while I was in college and ended several years after landing a couple jobs. Said Survey CMS engine supported addition of new math paradigms with module system for adding them to the engine and capable of handling multiple traversals and simplifications across all active or streamed formulas to produce a graphical report of your survey results.

That said, it was a modular parsing engine based on the typical "build a calculator with a math string parser" project and blown way out of proportion. So I am aware that programming language parsers are a somewhat different entity, but I have experience with a token-based parser and emitter used in an add-in for Word to build tokens that represented a report and would run against real data as well. I'm aware of and can partially read the IL code emitted by Reflector over an assembly, have past (limited) experience with 8bit and 16bit assembly (masm8 and masm16), and am more aware than I'd like to be about setting up a remote compiler and using Ruby to aid in the publishing and signing.

Soooo, that gives me enough knowledge to know I'd be in over my head if I don't have a mentor in the realm of compilers. Still interested enough to help out with questions, pointers, and some initial references?

Again, the referenced issue is #21024 ( in case the issue link doesn't work like it's failing in my editor now, here's the full link: Mixed Language Build in One Project )

CyrusNajmabadi commented 6 years ago

You offered assistance to @MohammadHamdyGhanem on compiler info. I'd love to take you up on that offer if it is still on the table, and I would be extremely grateful. I'm passionate about having mixed language projects. I'd really want to just start with support for C# and VB.NET interoperability, or a faux/emulated understanding of each other, such that you can run small VB.NET code files/classes within a C# project and vice versa.

It's definitely on the table. :) What is it you're looking to try to figure out wrt "Compiler info"?

ghost commented 6 years ago

@seriussoft There is another reason that makes mixed languages important. It can be used to make Razor and Blazor support VB out of the box. By the way, I dream of a VBlazor, where we can design web pages with just XAML and VB.NET code: https://github.com/dotnet/vblang/issues/329

seriussoft commented 6 years ago

@CyrusNajmabadi Great! And thank you in advance. 😄 So I've got experience in parsing, tokenizing, pre-compilation character formatting, unix-characters over ascii, and the likes. However, I'm not all that familiar with what I've discovered and briefly read in the dotnet/CSharpLang Lexical-Structure page. It says,

The lexical and syntactic grammars are presented in Backus-Naur form using the notation of the ANTLR grammar tool.

I learned 8bit and 16bit ASM, so I'm able to relate said experience a little to CIL code for both reading/writing and could likely become more effective in reading/writing it as necessary. I think that ultimately, I am looking for starting points on getting to know the Roslyn C#/VB.NET compiler process without spending the next year trudging blindly through the Roslyn codebase. Are there any great combo articles, smaller repos, or even books/eBooks/whitepapers that you know offhand I could use as my starting point? I taught myself how to build a compiler, some parsers, and a basic lexer on my own, so I learned it without the jargon and the compiler/lexer fundamentals, hence a weak point for me.

When working on the tokenizer at one of my previous employers, it was a combo of the MS Word token system and a lot of fluff on our end that was completely home-brewed. I essentially had to read through very old code and teach myself through trial-and-error, as nobody really understood exactly how it worked or how to integrate new tokens, token-groups, and token-value-swapping on the fly.

I do not mind researching, reading, trial-and-error, and whatever else is necessary to jump right in. But, I want to make sure that I have a grounded approach so that I'm not painting myself into a corner.

Do you know if any of these might be helpful? I'd already picked up 1 or 2 of these and have considered another one because I ultimately want to create my own dynamic DSL(s) and ALSO create a free, fully featured IDE (not just a fancy editor) that is cross platform AND supports multiple languages (such as the CLR family, php/python/r/ruby, and also user-created/home-brew DSLs). To me, this project is a step in that direction and also has the benefit of helping other developers (not just myself) in their endeavors to learn new languages, teach new languages, and become more proficient/versatile. Finally, it would allow contract and some corporate devs to quickly add prototypes and full-fledged features to older apps they acquired that are not in their language of proficiency, where rewriting said app is out of the question.

While I may not need to know the nitty gritty for this prototype, as Roslyn, Reflection, and module loading in domains do enough work that I may be able to get something going with a high-level understanding of what's going on, I feel knowing a little more than just the basics will help should I get stuck or confused. I'm familiar with building C and C++ projects: I have written my fair share of makefiles, used the gcc compiler while learning C++ back in the day, and even used a couple of free IDEs that used the gcc compiler (the MinGW port, I believe). As I've worked intensively with many C and C++ open source projects trying to integrate them into .NET (a WoW server and ImageMagick, to name a few) or provide .NET-facing APIs, I'm familiar with building, linking, and compiling in the broad sense.

I've no experience with ANTLR and only accidentally followed the Backus-Naur notation for my Survey CMS's formula/equation/macro engine. This realization is based on a quick search over the term and a couple of YouTube videos, so I don't claim total confidence in the subject. However, the concept of terminal/non-terminal, recursively replacing non-terminals until you have only terminals, and a priority/weighted system were all built into my custom math DSL, sadly before we had the ability to compile and link C# code from a string for running directly. But the positive, outside of the experience and fun that came with it, is that I taught myself how to do it and didn't even realize that what I was doing had already been discovered and standardized!

Now that you have an idea of where I'm coming from and where my experience in this particular niche problem lies, here are my starter questions:

Questions

  1. Do any of these books sound like they would be helpful (or are helpful based on your experience)? I have a couple of them, but have only just recently started on them.
  2. Are there any other books that make a great read and would help me to overall understand the topic in relation to what is going on under the hood in Roslyn, or any whitepapers, compound articles, or smaller repos for that matter that I could dig into?

  3. I'll post in a separate comment below, but my plan is to aim for the single-page compilation technique whereby we try to utilize the T4 Template trick: a template page on top, marked as no-compile, with a single-page compiler attached to it. I'll go into more detail in the next comment, but does this sound like an approach that will meet the needs of proving plausibility, feasibility and usefulness as necessary for such a prototype?

CyrusNajmabadi commented 6 years ago

Let's scope this down much smaller, so this won't be a major treatise :)

I would recommend starting to learn a little about the core Roslyn syntax model: https://github.com/dotnet/roslyn/wiki/Getting-Started-C%23-Syntax-Analysis

There's also a lot of good stuff on the wiki. Also, GitHub issues aren't likely to be the best way to discuss this stuff. I'd recommend you pop over to https://gitter.im/dotnet/roslyn to be able to talk about things more and to be able to have a better back and forth toward what you're trying to accomplish.

seriussoft commented 6 years ago

@CyrusNajmabadi @MohammadHamdyGhanem

My approach or battle plan is as follows, starting with a project that actually proves the plausibility of such an endeavor:

IL Support Plug-in/Add-In/Extension

Here's an example of a recent project that is really cool and does an approach similar to what I'd be looking to do for my prototype. It also proves, to an extent, that this is plausible, leaving me with proving whether it's feasible or not:

My thought process is that we have the DLR (custom tool for Dynamic Languages) for the "Iron" languages (IronPython, IronRuby, IronLisp [almost], and another that was dropped) as well as other CLI-linked languages such as Phalanger/Peachpie for PHP, written targeting the CLI (and last I heard they were updating their custom tool/compiler, Phalanger, to utilize the DLR as well). So ultimately, it's plausible that we would need a separate tool just for combined-language project compilation as well, which is essentially what I'm going to try to run with as my prototype. It will start out following the single-page-compiler technique used both in the IL Support project above and in the multitude of Razor->C# single-page-compiler/T4-Engine-Replacement projects I've seen (and tried) all over the VS Marketplace.

Thought Process

Legend
  1. Native vs Alien Language
     1.1. Native language - the language that the project is hosting.
     1.2. Alien Language - the language(s) of the single files that doesn't "belong" in the project, yet...
Current Battle Plan
  1. Utilize the T4 Template, 2-page technique.
     1.1. Top Page is not marked as compile. It will be in the Alien language.
     1.2. A separate compiler (or compiler wrapper or tool) will be marked as the individual compiler.
     1.3. Upon leaving the file (saving and/or closing), it will be recompiled.
     1.4. The compiled file will be in the native language.
     1.5. The new file will then be added as the under-page or the page underneath the Alien Language file in the solution hierarchy. It will either be C# with any necessary metadata (such as attributes and XML comments) included, or in another representation - possibly as a referenced compiled file with symbols. I'd prefer the C# route or a non-compiled route, if possible, but have a backup plan if this isn't possible...

  2. To talk to the now compiled code, we have 2 options:
     2.1. If the file is just a plain Native Language file (as in, we compiled the Alien Language file and then reversed it back to the Native Language), then we don't have to do anything else. By utilizing partial classes and the like, we can easily have our new code talk with our Native code as if we had typed only in Native...
     2.2. If we have to fully compile the Alien file because it has features that cannot be mimicked or rewritten in the Native language, then we will have to utilize what ins0mniaque does in the IL Support tool: use the extern modifier for methods, create external aliases for the classes that we want to share, provide friend access (only if necessary) to the outside compiled Alien library, and use the appropriate late-binding/late-loading attributes to tell the compiler to look no farther:

    // Requires: using System.Runtime.CompilerServices;
    // The original snippet omitted the return type; void is assumed here.
    [MethodImpl(MethodImplOptions.ForwardRef)]
    public static extern void DoSomethingInAlienLanguage(params object[] args);

  3. Should we need to do magic such as building a separate lib for the Alien Language instead of build-decompile-to-native-language(), then we would also do the same in the alien language file, assuming there is a way to do so in languages outside of C# such as VB.NET (I'm not certain, because it's been a while since I referenced an outside external, unmanaged assembly or COM+ object in VB.NET).

In this way, we wouldn't get to have full error protection at build time for the cross-party stuff. However, we could build in some form of protection so that as long as we create our intermediate models/POCOs to match existing classes, we can look for and compare them to make sure they match. We would still likely have a lot of dirty-feeling errors during runtime as we start out, much like the ones we can still get with the dynamic keyword or when using objects for COM+; however, I believe that we can overcome those at a later step.

  4. This means we have access to standard syntax highlighting, and can add in the linker/extern methods and external alias models/POCOs/structs when we need to mark something that can talk back and forth between Alien and Native languages. We may have to provide some tweaks so that VB.NET files don't show red squigglies in C# and vice versa (for example), but other than doing some Roslyn-based UI fixes, it should (in theory, or as my physics professor would say, in the "perfect physics world") just work.

So far, that's my battle plan. If you'd like a better explanation or to answer questions you might have, feel free to hit me up. I've only been thinking about this approach for a couple months, so it isn't cemented in or anything...merely my first real attempt.

seriussoft commented 6 years ago

Alright, @CyrusNajmabadi , thanks. Hopping over.

VBAndCs commented 5 years ago

This proposal may make it happen at some scale: https://github.com/dotnet/roslyn/issues/34821