EvotecIT / OfficeIMO

Fast and easy to use cross-platform .NET library that creates or modifies Microsoft Word (DocX) and later also Excel (XLSX) files without installing any software. Library is based on Open XML SDK
MIT License

Stable and unlikely to change vs unstable and subject to change vs deprecated API #215

Closed tmheath closed 3 months ago

tmheath commented 4 months ago

I've started working on HTML documentation for the public API, with examples and simple getting-started instructions using the NuGet package in a variety of languages. Are any parts of the API unlikely to change, and is anything known to be deprecated? I think it might be a good idea to lump anything stable together, because it's unlikely to need changing.

officeimodocs.zip Work in progress, hardly there right now. The color scheme is ugly but only serves for structure right now and will change. It's all plain HTML/CSS and nothing fancy; if you have any feedback, let me know... I'm assuming you'll want any documentation hosted at github.io eventually?

PrzemyslawKlys commented 4 months ago

So to answer your question in short: I don't know.

Longer version:

As for the proposed API documentation, I would prefer the documentation to be inline in the code, and then we would use tools that extract it and create MD files. Those MD files can then be used by, for example, Hugo. Hugo has lots of documentation themes so one doesn't have to reinvent the wheel:

So the goal would be to document as much code as possible directly in the code base. Additionally we could use SHFB (Sandcastle Help File Builder) - I used it 10 years ago.

Finally some tools may help with this:

I don't have time right now to find out how they work and whether they are of any use, but in the end I'm open to documenting the project directly in the code, with a final static Hugo website, probably hosted on GitHub Pages, built from MD files that are autogenerated from the source and then "taken" by the Hugo engine.

Hope this helps?

tmheath commented 4 months ago

Thank you very much... This does... It's a lot of work to get up to that level, but I've found the documentation for PySimpleGUI most impressive (I only found out just now, while getting that link, that they've switched to a paid model for commercial use).

PrzemyslawKlys commented 4 months ago

Here's another project with good documentation:

It's a lot of work to get to that level, but that's why you need automation. I myself use Copilot a lot for everything, plus the GhostDoc extension for Visual Studio, which helps with autodocumentation. The rest is just picking the right tools, maybe doing some automation using PowerShell, and getting Hugo running. But like you said, it's a lot of work :-) And OfficeIMO is one of my 50+ projects so I have things to do :)

tmheath commented 4 months ago

Hugo seems like it might be overkill. I've used the Go templates for a little local project I did, but I haven't touched them in a while; it might be simpler just to write something small real quick.

I almost never use AI for code stuff. I play around with it, and once recently I passed its advice on to someone else about small readability issues, and that at least was reasonable... You might appreciate Ollama if you aren't aware of it; it makes things really easy to use. Honestly, I noticed things in your codebase that seemed AI-generated, but given all your projects I wasn't sure. You're very prolific. I'm both shocked and unsurprised by AI: I've seen it spit out decent stuff and outright wrong stuff, so as yet I haven't considered it worth the time beyond boilerplate-type stuff (I tend to waste more time just thinking over how to go about doing the thing).

Worst case scenario, I take way too long on this, but come back eventually with several different options, might just come back sooner with one depending on myself.

PrzemyslawKlys commented 4 months ago

The OfficeIMO project is not AI-generated. However, I do use AI a lot. For example, with this PR AI helped me a lot to find out how to fix differences between my version and what MS proposed for version 3.0 - without it I would have spent a lot more time.

What you see in OfficeIMO is output from the Open XML SDK Productivity Tool:

It lets you open a Word document and shows you how it's built, as C# code you can copy/paste.

That's why you see so much code that looks AI-generated, but it isn't.

However, I am now working on a project for DNSClient over HTTPS via the DNS wire protocol. I didn't know anything about it, and with the help of AI I was able to pull it off ;) Sure, it's wrong a lot, but at the same time you can help it achieve what you want.

tmheath commented 4 months ago

SandBox I've run into licensing issues with the 2019 VS Build Tools (not sure about 17) when using them from work. FOSS doesn't require it, but I'd prefer avoiding that if I can; there's a grey area because this is FOSS. I had to go through a long sequence of workarounds to get Rust working after an update they did a while back.

Right, I remember seeing those in the docs. I ran into trouble using them though... For some reason, despite this codebase working fine cross-platform, the OpenXML SDK does not play well with Linux. I have no idea why, nor do I understand any part of it.

PrzemyslawKlys commented 4 months ago

Here's a quick query I made to Copilot:

image

I encourage you to try it ;)

tmheath commented 3 months ago

I played around a little with Gemma, and I'm going to try Code Llama... I think Copilot was taking context from something else. Gemma explained that the DNS query was asking for "example.com", while the other one I tried explained there wasn't enough context to determine more than it did (much better to understand that you can't say than to imagine nonsense)... The short answer is that I want everything I use hosted at least by myself.

I looked around at the tools you mentioned, and I think it's simplest to write a super simple Python documentation script, which I've started at this repo. Feel free to reject or accept this; when finished I'll submit a PR unless you've said you don't want it.

PrzemyslawKlys commented 3 months ago

I don't know. If you look at the Docs folder, it is autogenerated.

It can be done with that. That means that with a little effort on the DLL side, adding examples and so on inline as some libraries do, it would be possible to get it all generated by that or a similar app.

xmldoc2md "C:\Support\GitHub\OfficeIMO\OfficeIMO.Word\bin\Release\net472\OfficeIMO.Word.dll" "C:\Support\GitHub\OfficeIMO\Docs"
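To illustrate the general idea behind a converter like this, here is a minimal Python sketch. It is not how xmldoc2md itself is implemented; it only assumes the usual layout of the XML documentation file the C# compiler emits next to a DLL (`<doc>`, `<members>`, `<member name="...">`, `<summary>`, `<param>`). The sample assembly and member names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of a compiler-generated XML documentation file.
SAMPLE_XML = """<doc>
  <assembly><name>Example.Library</name></assembly>
  <members>
    <member name="T:Example.Library.Report">
      <summary>
        Represents a report
        that can be saved.
      </summary>
    </member>
    <member name="M:Example.Library.Report.Save(System.String)">
      <summary>Saves the report to the given path.</summary>
      <param name="path">Destination file path.</param>
    </member>
  </members>
</doc>
"""

# The prefix before ":" in a member name identifies the kind of member.
KIND_LABELS = {"T": "Type", "M": "Method", "P": "Property", "F": "Field", "E": "Event"}


def _text(element):
    """Return an element's full text content with whitespace collapsed."""
    if element is None:
        return ""
    return " ".join("".join(element.itertext()).split())


def xml_docs_to_markdown(xml_text):
    """Render every <member> entry as a Markdown section."""
    root = ET.fromstring(xml_text)
    lines = [f"# {_text(root.find('assembly/name'))}", ""]
    for member in root.iterfind("members/member"):
        kind, _, name = member.get("name", "").partition(":")
        lines.append(f"## {KIND_LABELS.get(kind, kind)}: `{name}`")
        lines.append("")
        lines.append(_text(member.find("summary")))
        for param in member.iterfind("param"):
            lines.append(f"- `{param.get('name')}`: {_text(param)}")
        lines.append("")
    return "\n".join(lines)


markdown = xml_docs_to_markdown(SAMPLE_XML)
print(markdown)
```

A real tool would also need to handle tags such as `<returns>`, `<example>` and `<see cref="..."/>`, which this sketch ignores.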

As for GitHub Copilot - it's a bit more than your dumb AI assistant: image

Select code, and tell it to fix the rest :)

tmheath commented 3 months ago

With any luck I'll get through procrastinating this week and finally implement a Python script or Go program to give Ollama the ability to run LLMs with memory and document use; no plans on running multiple together yet. That does look easy. I've been handling the problem by using multiple paradigms (F# where it makes sense, C# where it makes sense, for example. As an aside, the DNS thing is probably something I'd use F# for because of the type system, but if it's more complicated than I'm thinking, C# could easily be better; just trying to give an example).

I'm going to look into xmldoc2md from home. Ultimately everything comes down to your choice. My issues with the other tools all came down to complexity: converting comments to usable documentation should be extremely simple, yet I'm having to read a giant manual for each one just to get anything done. I can throw a quick thing together in Python or Go and be perfectly at ease with cobbled-together HTML (a static CSS file would be more than enough to get it looking decent). I'd appreciate those tools if I actually needed many of those options, but I feel it's easier to not have what's not needed, granted it might be easier in the future.

PrzemyslawKlys commented 3 months ago

My main thing about documentation is that I very rarely use it. I use examples in 90% of cases, but the documentation comes in handy when I'm out of options.

The second thing is - I don't just want documentation. I want a great website with proper documentation that looks and feels nice, just like those Hugo websites. This means my options are pretty much limited to:

  1. Find a tool, or write one, that will create Markdown out of what's in the sources
  2. Once it's in Markdown, find out which template is good enough and looks OK
  3. Find out how the template expects those Markdown files to be "tagged" so they can be integrated without manual work
  4. Write some piece of code that translates the simple MD documentation into a proper website
  5. Hugo basically works with Markdown, so it's more about placing files in the correct place and adding things at the top of each file so they are recognized as pages.
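The "tagging" in steps 3 and 5 could be sketched roughly as below. This assumes the generated Markdown pages start with a `# ` heading and that the chosen theme is satisfied with YAML front matter containing only `title` and `weight`; a given theme may expect more fields. The function name and sample page are hypothetical.

```python
import json


def add_front_matter(markdown_text, fallback_title, weight):
    """Prepend YAML front matter unless the page already has it."""
    if markdown_text.startswith("---"):
        return markdown_text
    # Use the first top-level heading as the page title when there is one.
    title = fallback_title
    for line in markdown_text.splitlines():
        if line.startswith("# "):
            title = line[2:].strip()
            break
    # json.dumps gives a double-quoted, escaped string that YAML accepts.
    header = ["---", f"title: {json.dumps(title)}", f"weight: {weight}", "---", ""]
    return "\n".join(header) + "\n" + markdown_text


page = add_front_matter("# ExampleType\n\nCreates documents.\n", "Untitled", 10)
print(page)
```

Applied to every file in the generated Docs folder and written into the Hugo content directory, something like this would cover the "placing it in the correct place" part without manual edits.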

I would rather not build a totally custom solution, because that's maintenance I would like to avoid.

tmheath commented 3 months ago

I figured you kinda wanted to do something like that. I assumed it'd be fine to have a documentation link that takes you to a sub-site for the documentation, but yeah, that's what I get for assuming. I've got to look into everything related to Hugo; this marks my third time trying to break into using it, and so far each time I've been pushed back by the volume of features.

The Python script I started already has the boilerplate in place for any generic file. C# support has just started getting implemented, but I want to read some books I have at home to make sure I'm doing that the "right" way. I figured it'd be useful generically, more than just here, as I'm abstracting the language and the output from the tool so you can feed (C#, F#, Go, C, C++, Python, etc.) into (MD, docx, LaTeX, PDF, text, etc.), barring implementation. In other words, rather than being super complicated with tons of features where any one project only uses at most a handful in depth, it's exceptionally broad but very simple. It might be better to handle language-based customization from short configuration files. In this way it's a single tool that does exactly the same thing no matter what: pipe code into it and get formatted output, or run it on one or more files. I wouldn't place my faith in me if I were you though; I have enough of a problem with procrastinating. I'll work on it and keep an eye out.
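A rough sketch of that shape, for illustration only (it is not the script from the repo mentioned above): the language side is reduced to a doc-comment prefix per language, the output side to one renderer function per format, and the extractor is a generator so entries can be processed while the input is still being scanned. All names and the prefix table are assumptions, and XML tags inside `///` comments are not parsed.

```python
# Doc-comment prefix per language; this table stands in for the
# "short configuration files" idea and is purely illustrative.
COMMENT_PREFIXES = {"csharp": "///", "fsharp": "///", "go": "//"}


def extract_comments(lines, language):
    """Yield (comment_text, next_code_line) pairs while scanning the input."""
    prefix = COMMENT_PREFIXES[language]
    pending = []
    for raw in lines:
        line = raw.strip()
        if line.startswith(prefix):
            pending.append(line[len(prefix):].strip())
        elif pending and line:
            # First non-blank code line after a comment block closes the entry.
            yield " ".join(pending), line
            pending = []


def render_markdown(entries):
    return "\n".join(f"### `{code}`\n\n{text}\n" for text, code in entries)


def render_text(entries):
    return "\n".join(f"{code}: {text}" for text, code in entries)


# One renderer per output format; adding a format means adding one function.
RENDERERS = {"md": render_markdown, "text": render_text}


def document(lines, language, output):
    """Run the same pipeline regardless of input language or output format."""
    return RENDERERS[output](extract_comments(lines, language))


sample = [
    "/// Adds two numbers.",
    "public int Add(int a, int b) => a + b;",
]
print(document(sample, "csharp", "md"))
```

Because `extract_comments` accepts any iterable of lines, passing `sys.stdin` as `lines` would cover the piped-input case, and an open file object would cover running it on a file.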

tmheath commented 3 months ago

Honestly, I was considering swapping over to Go if performance becomes a problem, which it shouldn't. I already know I can go ahead and process the objects while scanning input if data is being piped in, but I wasn't worried about that.