feature request: save on dispose option

wtertinek / Linq2Acad

A library that aims to simplify AutoCAD .NET plugin code

MIT License

61 stars 34 forks source link

feature request: save on dispose option #8

Closed sanzoghenzo closed 3 years ago

sanzoghenzo commented 3 years ago

Hi, in the pursuit of using this wonderful library in accoreconsole, I would like to add an option to save the drawing during the disposal of AcadDatabase.

As of now, I'm using something like this:

using (var db = AcadDatabase.Open(dwgPath, DwgOpenMode.ReadWrite))
{
  DoSomethingWIthDB(db);
  db.SaveAs(dwgPath);
}

But this will save the drawing before committing the transaction. It works, but I'm not sure is the best thing to to.

I was thinking of

adding a saveOnDispose field in the AcadDatabase class;
add the Database.SaveAs(path) to Dispose() if the filed is true;
add the parameter saveOnclose to the constructors and static methods.

Do you think that this is a good way to go? If yes, I can work on it and submit a PR.

wtertinek commented 3 years ago

Hi sanzoghenzo,

great, this is a very interesting use case!

Yes, your proposed way to implement this feature is a good way to go. The internal implementation can be done like that. Maybe we can call it saveOnCommit? Because we only want to save in the case of a commit, not on each dispose.

For the public API I want to share some ideas how the factory methods could be refactored such that we can use saveOnCommit for the Create factory methods as well. So, candidates for the use with saveOnCommit would be:

public static AcadDatabase Create();
public static AcadDatabase Create(bool keepOpen);

public static AcadDatabase Open(string fileName, DwgOpenMode openMode);
public static AcadDatabase Open(string fileName, DwgOpenMode openMode, bool keepOpen);
public static AcadDatabase Open(string fileName, DwgOpenMode openMode, string password);
public static AcadDatabase Open(string fileName, DwgOpenMode openMode, string password, bool keepOpen);

But saving the file only make sense when the database is newly created or read/write. For newly created databases we don't have a file name, so we can simply add a parameter that provides the file name:

public static AcadDatabase Create();
public static AcadDatabase Create(bool fileToSaveToOnCommit);
public static AcadDatabase Create(bool keepOpen);
public static AcadDatabase Create(bool keepOpen, bool fileToSaveToOnCommit);

For the second case, a database that is opened read/write, I like your suggestion to use a boolean parameter which indicates that the database should be saved on commit. One thing we have to consider is that the parameter makes only sense if we open a DWG file read/write, saveOnCommit would crash on a read-only file. To distinguish Open (read-only) from Open (read/write), I would not hesitate to get rid of the openMode parameter and rename the methods to OpenForRead and OpenForWrite:

public static AcadDatabase OpenForRead(string fileName);
public static AcadDatabase OpenForRead(string fileName, bool keepOpen);
public static AcadDatabase OpenForRead(string fileName, string password);
public static AcadDatabase OpenForRead(string fileName, string password, bool keepOpen);

public static AcadDatabase OpenForWrite(string fileName);
public static AcadDatabase OpenForWrite(string fileName, bool fileToSaveToOnCommit);
public static AcadDatabase OpenForWrite(string fileName, bool keepOpen);
public static AcadDatabase OpenForWrite(string fileName, bool keepOpen, bool fileToSaveToOnCommit);
public static AcadDatabase OpenForWrite(string fileName, string password);
public static AcadDatabase OpenForWrite(string fileName, string password, bool fileToSaveToOnCommit);
public static AcadDatabase OpenForWrite(string fileName, string password, bool keepOpen);
public static AcadDatabase OpenForWrite(string fileName, string password, bool keepOpen, bool fileToSaveToOnCommit);

But now we have a terrible explosion of overloads, which cries for either optional parameters or separate options classes. IMHO the cleanest way would be to have two method overloads for Create, OpenForRead and OpenForWrite: one overload for the default case and one overload with additional options. Option classes highly improve the readability of argument lists.

public static AcadDatabase Create();
public static AcadDatabase Create(CreateOptions options);

public class CreateOptions
{
  public bool FileToSaveToOnCommit { get; set; }

  [System.ComponentModel.EditorBrowsable(System.ComponentModel.EditorBrowsableState.Advanced)]
  public bool KeepDatabaseOpen { get; set; }
}

public static AcadDatabase OpenForRead(string fileName);
public static AcadDatabase OpenForRead(string fileName, OpenForReadOptions options);

public class OpenForReadOptions
{
  public string Password { get; set; }

  [System.ComponentModel.EditorBrowsable(System.ComponentModel.EditorBrowsableState.Advanced)]
  public bool KeepDatabaseOpen { get; set; }
}

public static AcadDatabase OpenForWrite(string fileName);
public static AcadDatabase OpenForWrite(string fileName, OpenForWriteOptions options);

public class OpenForWriteOptions
{
  public bool SaveFileOnCommit { get; set; }

  public string Password { get; set; }

  [System.ComponentModel.EditorBrowsable(System.ComponentModel.EditorBrowsableState.Advanced)]
  public bool KeepDatabaseOpen { get; set; }
}

This gives us the additional possibility to hide the KeepDatabaseOpen argument from IntelliSense. Keeping a created or openend database open can be useful in some (rare) cases, but not handled correctly it will create a memory leak which eventually crashes AutoCAD.

Furthermore I would completely remove the SaveAs method from the API. You're absolutely right, it makes only sense to save after the commit, which is done in the Dispose method. From this perspective, SaveAs has no place in the API.

Finally, using the database in your example would look something like this:

using (var db = AcadDatabase.OpenForWrite(dwgPath, new OpenForWriteOptions { SaveFileOnCommit = true }))
{
  DoSomethingWIthDB(db);
}

What do you think about this API? Do you think this is usable? And what do you think should be the default case for this feature? Should we maybe turn it around such that the file is saved by default and the use can opt out with DontSaveFileOnCommit = true?

Your participation and feedback is highly appreciated.

benkoshy commented 3 years ago

Why not leave the library as-is and simply do it with another operation?

// manipulate the db
using (var db = AcadDatabase.Open(dwgPath, DwgOpenMode.ReadWrite))
{
  DoSomethingWIthDB(db); 
}

// save the db
var db = getDb(dwgPath);
db.SaveAs(dwgPath);

If the database is purely in memory, then simply retain the reference?

Database db = new Database(etc.)
using (db))
{
  DoSomethingWIthDB(db); 
}

// save the db
 db.SaveAs(dwgPath);

sanzoghenzo commented 3 years ago

Hi sanzoghenzo,

great, this is a very interesting use case!

Yes, your proposed way to implement this feature is a good way to go. The internal implementation can be done like that. Maybe we can call it saveOnCommit? Because we only want to save in the case of a commit, not on each dispose.

You're right, it makes more sense.

But now we have a terrible explosion of overloads, which cries for either optional parameters or separate options classes. IMHO the cleanest way would be to have two method overloads for Create, OpenForRead and OpenForWrite: one overload for the default case and one overload with additional options. Option classes highly improve the readability of argument lists.

I started the conversation exactly for this, I feared to losing my mind around a ton of overloads 😱

Coming from 7 years of mostly python development, I tend to favor optional parameters (keyword arguments); to me it seems clearer than having overloads that only assign a default value, so that I can immediately see what are the defaults via IntelliSense. I can see the power of overloads for completely different signatures (eg parameter types that needs a pre-processing to adapt to the "core" method).

That said, option classes are good way to handle this, and the API you defined are brilliant! It is also easier to add new options if needs be.

Finally, using the database in your example would look something like this:
using (var db = AcadDatabase.OpenForWrite(dwgPath, new OpenForWriteOptions { SaveFileOnCommit = true }))
{
  DoSomethingWIthDB(db);
}
What do you think about this API? Do you think this is usable? And what do you think should be the default case for this feature? Should we maybe turn it around such that the file is saved by default and the use can opt out with DontSaveFileOnCommit = true?

Your participation and feedback is highly appreciated.

Well, the opt out version works better for me (less code to write 😉 ).

I can only speak from my use cases, but when I open drawings in read/write mode is because I want to edit the file, so I expect that my edits are committed and saved by default.

Temporary works on a read/write dwg can be useful in the Acad GUI (using the Active method), but I miss to see the usefulness in a database read or created in a "headless" manner. I'm sure there are some use cases, but I assume they are not the default.

Thanks a lot for your great in-depth analysis! Let me know if you can implement it or of you would like a help at this.

Why not leave the library as-is and simply do it with another operation?

Because it defeats the purpose of this great library, imho. since you envision a getDb method, you might as well do a

using (var db = getDb(dwgPath))
{
  using (var acdb = AcadDatabase.Use(db)
  {
    DoSomethingWithDB(acdb);
  }
  db.SaveAs(dwgPath);
}

that is boilerplate code that can be embedded into Linq2Acad (and getDB would be simply a copy of the first 3 lines of the OpenInternal method)

If the database is purely in memory, then simply retain the reference?

Nope, the using directive disposes the AcadDatabase object, so you get nothing to save, and all your edits are lost.

wtertinek commented 3 years ago

@benkoshy, thanks for your questions.

Why not leave the library as-is and simply do it with another operation?

// manipulate the db
using (var db = AcadDatabase.Open(dwgPath, DwgOpenMode.ReadWrite))
{
  DoSomethingWIthDB(db); 
}

// save the db
var db = getDb(dwgPath);
db.SaveAs(dwgPath);

I'm afraid this won't work. After leaving the using block the file is closed without saving and your changes are lost. var db = getDb(dwgPath); will give you the unchanged database. One way this could work would be something like this:

Database tmpDatabase;

// manipulate the db
using (var db = AcadDatabase.Open(dwgPath, DwgOpenMode.ReadWrite, keepOpen: true))
{
  DoSomethingWIthDB(db);
  tmpDatabase = db;
}

tmpDatabase.SaveAs(dwgPath);
tmpDatabase.CloseInput(true);

In my opinion client code is much cleaner if we integrate the discussed approach into the API:

using (var db = AcadDatabase.OpenForWrite(dwgPath, new OpenForWriteOptions { SaveFileOnCommit = true }))
{
  DoSomethingWIthDB(db);
}

If the database is purely in memory, then simply retain the reference?
Database db = new Database(etc.)
using (db))
{
  DoSomethingWIthDB(db); 
}

// save the db
 db.SaveAs(dwgPath);

The problem with this code is that the using statement disposes the db object after the using block is left and db.SaveAs(dwgPath); will throw an exception. using is syntactic sugar, the actual code looks something like this:

Database db = new Database(etc.)

try
{
  DoSomethingWIthDB(db); 
}
finally
{
  db.Dispose();
}
// --------> From here on accessing db will throw an exception

// save the db
 db.SaveAs(dwgPath);

wtertinek commented 3 years ago

Coming from 7 years of mostly python development, I tend to favor optional parameters (keyword arguments); to me it seems clearer than having overloads that only assign a default value, so that I can immediately see what are the defaults via IntelliSense. I can see the power of overloads for completely different signatures (eg parameter types that needs a pre-processing to adapt to the "core" method).

That said, option classes are good way to handle this, and the API you defined are brilliant! It is also easier to add new options if needs be.

Thanks for bringing up keyword/named arguments. I first thought to use named arguments, but after playing around with some use cases I think option classes are a better way to do it. I'd like to elaborate on my rationale behind that, maybe you have different thoughts on that (given your Python experience) and we can come up with something better.

Say we defined OpenForWrite with optional arguments like

public static AcadDatabase OpenForWrite(string fileName, string password = null, bool saveFileOnCommit = false, bool keepDatabaseOpen = false);

This would allow for a very short and clear client code:

using (var db = AcadDatabase.OpenForWrite(dwgPath, saveFileOnCommit: true))
{
}

But in my experience named arguments or not used frequently by the average C# developer. I guess many developers are not even aware that named arguments exist. One thing to keep in mind is that the vast majority of users of the AutoCAD API are developers with little or not more than average knowledge about specific C# language features. In my experience many are professionals in a specific (non-IT) domain who started coding to solve some problem they have in AutoCAD. Therefore I think most developers will go with the most common pattern, positional arguments. With positional arguments the client code will look like this:

using (var db = AcadDatabase.OpenForWrite(dwgPath, null, true))
{
}

Which still is short, but lacks readablility. If the parameter keepDatabaseOpen is used as well or if another parameter is added to the method things even get worse.

Option classes on the other hand have advantages and disadvanges as well. The main drawback is that client code gets more cluttered and more typing is required, but the advantage is that the developer's intention is made explicit and there is no room for misunderstandings:

using (var db = AcadDatabase.OpenForWrite(dwgPath, new OpenForWriteOptions { SaveFileOnCommit = true }))
{
}

No need for the "special" syntax of named arguments, the parameters are still named (with the use of properties). And, as you already said, adding parameters is easy and doesn't break anything.

BTW: Any better names than OpenForRead and OpenForWrite?

Well, the opt out version works better for me (less code to write 😉 ).

I can only speak from my use cases, but when I open drawings in read/write mode is because I want to edit the file, so I expect that my edits are committed and saved by default.

That's as very good point. Totally agree, that should be the default case. So your example would be

using (var db = AcadDatabase.OpenForWrite(dwgPath))
{
  DoSomethingWIthDB(db);
}

And someone who wants to opt out uses

using (var db = AcadDatabase.OpenForWrite(dwgPath, new OpenForWriteOptions { DoNotSaveFileOnCommit = true }))
{
  DoSomethingWIthDB(db);
}

Temporary works on a read/write dwg can be useful in the Acad GUI (using the Active method), but I miss to see the usefulness in a database read or created in a "headless" manner. I'm sure there are some use cases, but I assume they are not the default.

Totally agree. I guess DoNotSaveFileOnCommit won't be used very often, but should be there, just in case.

Thanks a lot for your great in-depth analysis! Let me know if you can implement it or of you would like a help at this.

If you want to work on this feature you are very welcome to do so!

One more thing I forgot: what do you think about eliminating SaveAs? Because the situation is the same in all cases (Active(), Use(), OpenFor*() and Create()), SaveAs() only makes sense after the commit.

sanzoghenzo commented 3 years ago

... in my experience named arguments or not used frequently by the average C# developer. I guess many developers are not even aware that named arguments exist. One thing to keep in mind is that the vast majority of users of the AutoCAD API are developers with little or not more than average knowledge about specific C# language features. In my experience many are professionals in a specific (non-IT) domain who started coding to solve some problem they have in AutoCAD.

I totally see your point. I started with VBA and python, and both have optional parameter... but they don't have overloads! (A common pattern used in python is the **kwargs parameter to handle the various possible parameters dinamically, but it totally hides the API to the user) I see that optional parameters were included in C# 4.0, in 2010. Judging by the documentation I could find around AutoCAD .NET in these years, it is mostly a child of the older AutoCAD versions, so obviously no optional parameters there...

But I digress, class parameter are totally fine for me.

BTW: Any better names than OpenForRead and OpenForWrite?

They look good to me. Maybe Read and Edit? Or they're not so self-explanatory?

If you want to work on this feature you are very welcome to do so!

Ok, I'll take a stab ASAP.

One more thing I forgot: what do you think about eliminating SaveAs? Because the situation is the same in all cases (Active(), Use(), OpenFor*() and Create()), SaveAs() only makes sense after the commit.

I didn't give you feedback because I agree to it, it was a tacit consent 😉

wtertinek commented 3 years ago

BTW: Any better names than OpenForRead and OpenForWrite?

They look good to me. Maybe Read and Edit? Or they're not so self-explanatory?

I like that, Edit is better then Write. But I would still use the Open prefix for the two, to indicate that the two methods do the same thing but in a slightly differnt way. OpenForRead and OpenForEdit would be not bad, but I think OpenForRead does not convey the "readonly-ness" which is the actual difference to OpenForEdit. What about OpenReadOnly?

And btw, I think it actually is redundant to use the parameter DoNotSaveFileOnCommit we discussed before. Calling OpenForEdit with DoNotSaveFileOnCommit set to true would be equivalent to calling OpenReadOnly. Furthermore with this change we could make one option class OpenOptions with properties Password and KeepDatabaseOpen instead of two option classes.

sanzoghenzo commented 3 years ago

BTW: Any better names than OpenForRead and OpenForWrite?

They look good to me. Maybe Read and Edit? Or they're not so self-explanatory?

I like that, Edit is better then Write. But I would still use the Open prefix for the two, to indicate that the two methods do the same thing but in a slightly differnt way. OpenForRead and OpenForEdit would be not bad, but I think OpenForRead does not convey the "readonly-ness" which is the actual difference to OpenForEdit. What about OpenReadOnly?

Sounds good!

And btw, I think it actually is redundant to use the parameter DoNotSaveFileOnCommit we discussed before. Calling OpenForEdit with DoNotSaveFileOnCommit set to true would be equivalent to calling OpenReadOnly. Furthermore with this change we could make one option class OpenOptions with properties Password and KeepDatabaseOpen instead of two option classes.

What if someone wants to do some temporary work on the db? Not sure if it's the right example, but maybe one needs to explode the blocks in the drawing to get the information of the contained entities... could it be done in the read only db? Should we expose the commitTransaction field to the public API?

Also, we are missing the ability to save as another file; this should be an option in the class for OpenForEdit only, with the default being the given fileName.

wtertinek commented 3 years ago

What if someone wants to do some temporary work on the db? Not sure if it's the right example, but maybe one needs to explode the blocks in the drawing to get the information of the contained entities... could it be done in the read only db? Should we expose the commitTransaction field to the public API?

Very good point. Doing temporary work is always possible. In the readonly and also in the read/write case the current implemenation reads the database from the file into a database object in memory (see OpenInternal). You can do whatever you want with the database, you can also commit the transaction, the change never go back into the file, because the connection to the file is closed via CloseInput (line 865). And this is actually is incorrect in the current implementation, DwgOpenMode actually has no effect, beacuse whether the file is opened readonly or read/write, it is closed immediately after reading all data.

So I think the right way to open the file in our new implementation would be

OpenReadOnly:

var database = new Database(false, true);
database.ReadDwgFile(fileName, FileOpenMode.OpenForReadAndReadShare, false, password);
database.CloseInput(true);

OpenForEdit:

var database = new Database(false, true);
database.ReadDwgFile(fileName, FileOpenMode.OpenForReadAndWriteNoShare, false, password);
// Close the file after commit

The only question is what happens to changes in the database in the case of OpenForEdit? Are they saved automatically when calling database.CloseInput? Or do we still need database.SaveAs? We should investigate this, I haven't tried that out yet.

Also, we are missing the ability to save as another file; this should be an option in the class for OpenForEdit only, with the default being the given fileName.

Good point. Than we should again make two separate option classes and call the property maybe SaveAsFileName?

wtertinek commented 3 years ago

So I think the right way to open the file in our new implementation would be

OpenReadOnly:
var database = new Database(false, true);
database.ReadDwgFile(fileName, FileOpenMode.OpenForReadAndReadShare, false, password);
database.CloseInput(true);

I just found in the documentation that by default the database is lazy-loaded and CloseInput forces "an immediate read of all necessary information from the file into the Database object and disconnects the file from the Database". So we could probably gain a performance win if the file is lazy-loaded. If I understand the documentation correctly CloseInput is not absolutely necessary to correctly handle the dwg file access, it just reads all data and closes the connection to the file. I think there's some experimentation needed.

The only question is what happens to changes in the database in the case of OpenForEdit? Are they saved automatically when calling database.CloseInput? Or do we still need database.SaveAs? We should investigate this, I haven't tried that out yet.

Found the answer here in the AutoCAD forum, SaveAs is still needed.

Also, we are missing the ability to save as another file; this should be an option in the class for OpenForEdit only, with the default being the given fileName.

Good point. Than we should again make two separate option classes and call the property maybe SaveAsFileName?

One furher option we could/should add here is to provide the version under which the file should be saved. In the current SaveAs implementation I used DwgVersion.Newest which is a bad idea, DwgVersion.Current would be best as the default. But I couldn't find out what "Current" really means. Is it "keep the current version of the file" or is it "save it as the current AutoCAD version"?

EDIT: Just found out that Current, Newest and AC1032 (which is AutoCAD 2018) are the same.

So the OpenForEditOptions class could have one more parameter DwgVersion but I would strongly recommend to not reuse the enum Autodesk.AutoCAD.DatabaseServices.DwgVersion. It holds cryptic enum values that are meaningless to an average API user. We could define something like this (but should find a better name for the enum):

public enum SaveAsDwgVersion
{
  None,
  AutoCad2004,
  AutoCad2007,
  AutoCad2010,
  AutoCad2013,
  AutoCad2018,
  DontChange, // Better than Current? Or Keep? EDIT: We can use Database.LastSavedAsVersion for that
  NewestAvailable // We could remove this one, since the newest available is AutoCad20xx wit hthe higehst number
}

and could do the conversion to Autodesk.AutoCAD.DatabaseServices.DwgVersion internally.

This would cover a new use case, converting DWG files to the newest version:

foreach (var fileName in Directory.GetFiles(sourceFolder))
{
  using (AcadDatabase.OpenForEdit(fileName,
                                  new OpenForEditOptions
                                  {
                                    SaveAsFileName = Path.Combine(targetFolder, Path.GetFileName(fileName)),
                                    DwgVersion = SaveAsDwgVersion.NewestAvailable
                                  }))
  {
  }
}

sanzoghenzo commented 3 years ago

I just found in the documentation. [...]

I was crafting an answer to your points reporting more or less the same info (but from different sources), but I was too slow!

Also, we are missing the ability to save as another file; this should be an option in the class for OpenForEdit only, with the default being the given fileName.

Good point. Than we should again make two separate option classes and call the property maybe SaveAsFileName?

One furher option we could/should add here is to provide the version under which the file should be saved. In the current SaveAs implementation I used DwgVersion.Newest which is a bad idea, DwgVersion.Current would be best as the default. But I couldn't find out what "Current" really means. Is it "keep the current version of the file" or is it "save it as the current AutoCAD version"?

Never understood it myself. For one automation I wrote some time ago, I ended up hard-coding the version needed by my colleague...

So the OpenForEditOptions class could have one more parameter DwgVersion but I would strongly recommend to not reuse the enum Autodesk.AutoCAD.DatabaseServices.DwgVersion. It holds cryptic enum values that are meaningless to an average API user. We could define something like this (but should find a better name):
public enum SaveAsDwgVersion
{
  None,
  AutoCad2004,
  AutoCad2007,
  AutoCad2010,
  AutoCad2013,
  AutoCad2018,
  DontChange, // Better than Current? Or Keep?
  NewestAvailable
}
and could do the conversion to Autodesk.AutoCAD.DatabaseServices.DwgVersion internally.

OK, some thoughts:

Wikipedia refers to them as "DWG 2004", "DWG 2007" and so on, should we use that?
That is the meaning of None?
I like DontChange better, more meaningful;
I assume NewestAvailable refers to the running AutoCAD version?
How to deal with a selected version that is more recent than the running AutoCAD version?

This would cover a new use case, converting DWG files to the newest version:

This is something that can be already done with TrueView (or was it RealDWG? Too many Autodesk tools...), but it's a good idea to have if you need to combine it with other actions!

wtertinek commented 3 years ago

OK, some thoughts:

Wikipedia refers to them as "DWG 2004", "DWG 2007" and so on, should we use that?

Yeah, that's better 👍

* That is the meaning of `None`?

There's an analyzer that says that enums should have a zero-value (CA1008), such that you can have an uninitialized variable:

var version = SaveAsDwgVersion.None; // Uninitialized

* I assume `NewestAvailable` refers to the running AutoCAD version?
Sorry, I made some edits on my previous comment (forgot some things), maybe we should skip NewestAvailable. I think the explicit versions and DontChange are enough (and not confusing).
* How to deal with a selected version that is more recent than the running AutoCAD version?

I would say we solve this via conditional compilation. The C# projects for the specific AutoCAD versions have a preprocessor flag like AutoCAD_2021, so we flag DWG2018 with AutoCAD_2018 to AutoCAD_2021.

sanzoghenzo commented 3 years ago

I always forget to mention the issue in my commits to close it automatically 😅 I'm glad I was able to help this great project!

wtertinek commented 3 years ago

Thanks for your contribution! If you have more ideas on how we can improve Linq2Acad don't hesitate to open an issue or pr or write something in the discussions area. Working with you was fun and very productive! Thx!