Closed mvphelps closed 1 month ago
@dotnet-policy-service agree
TestLoadJIT_3 looks like it has a flaky history. Want me to try some troubleshooting? Do it here or in another PR? Thanks.
TestLoadJIT_3 looks like it has a flaky history. Want me to try some troubleshooting? Do it here or in another PR? Thanks.
don't worry. it also fails in other PRs https://github.com/dotnet/TorchSharp/pull/1383#issuecomment-2417021097
By the way would it be a better idea to modify PackedSequence
itself instead of modifying the dispose scope system?
Those tensors are created in PackedSequence
's constructor and the constructor is called under the scope. I think that's why they will be disposed. Maybe easily detaching them from the scope could solve the problem?
(Just a suggestion. Maybe I have missed something.)
@yueyinqiu Thank you for such a quick response. I certainly don't see this PR as final. I will ignore the JIT test going forward thank you.
I did consider that option - only modifying the PackedSeqeunce object, and I conceded it would work. My concern with this is that a PackedSequence is made of tensors. They can be detached, but then the user still needs to manage disposing the PackedSequence in order to dispose those tensors. This is the same as saying the user must rely on only techniques 1 and 2 as discussed in the Memory Management document. This would also not follow existing patterns already in the code base. For example, DataLoader uses scope to manage any temporaries created in GetTensor an collate, see https://github.com/dotnet/TorchSharp/blob/1dd3ae507b0881bfb41c00c185438c7884f6a59f/src/TorchSharp/DataLoader.cs#L434. This is a good pattern as it aleves the user of managing any temporaries.
The opposite question - why should PackedSequence not be on par with the memory management treatment of Tensor? I think it has all of the same use cases as far as lifetime goes.
My particular use case, my sequence data won't all fit in RAM all at once. So I'll use a custom data loader (based on the built-in one) that deals with PackedSequence instead of Tensor. I'd like to keep with that implementation as it does manage the temporaries via scope. This does necessitate the ability to move the PackedSequence to the outer dispose scope after it is created.
Are there specific concerns with modifying the DisposeScope object? I certainly can agree there would be a slight performance penalty due to the type checking. It does also represent a rather ugly bit of code duplication. That's why I mentioned an interface - that way these objects can be treated identically under the scope system, and there would be no branches or performance penalty. And now that I look I see that I missed a check on IsInvalid in DisposeScope (line 246) - that's what I get for creating the duplication!
I think the best solution is to implement a common interface on both Tensor and PackedSequence, and have DisposeScope only accept that. It can then treat these two types consistently, with no performance penalty, no code duplication, all while not foisting on to the user the responsibility of passing around disposables to manage. The interface would be pretty simple: IUnmanagedTorchReference{ OwningDisposeScope {get; set;} IsInvalid {get;}} (better name suggestions welcome!)
What problems/concerns do you see with this approach? I am open to options here. Note I implemented as I did for this PR as a middle road solution to start the conversation and get my idea communicated, without going so far as introducing a new type. Just trying to tread lightly as I meet the community here. Thanks again!
Wow! Really appreciate your detailed explanation. I got your point.
A common interface would be a great idea, which could even allow users to manage their objects with our dispose scopes. (Currently it is possible to attach a IDisposable
, but there isn't a automatically changed OwningDisposeScope
. And we could add extension methods on it. Really nice.)
By the way, IsInvalid
seems to be used for statistics purpose (for unit test) alone. Is it really important?
@NiklasGustafsson
Great thank you. Added a new commit with what I'm thinking. I did do this one as what I consider release quality. Feel free to shred it @yueyinqiu, @NiklasGustafsson !
The new interface is
IDisposeScopeClient: IDisposable
DisposeScope? OwningDisposeScope
bool IsInvalid
As per your question, is IsInvalid important? Yes:
In DisposeScope, everything that used IDisposable now uses IDisposeScopeClient. I tried leaving some of them (like the DisposablesView), but it was resulting in casting that looked out of place. Additionally I've removed the branches I initially added for PackedSequence, and retained the original code that worked with Tensor exactly as it was - though without the pattern match. It looks much cleaner this way imho.
I've removed my tests from TestNNUtils and instead introduced two new test files, one specifically for PackedSequence with DisposeScope, and another just for custom user objects using DisposeScope. All tests are passing.
About a minor limitation with a consumer implementing IDisposeScopeClient. They will not be able to register their object with the scope system directly as Tensor and PackedSequence do, since those are calling the internal DisposeScopeManager.ThreadSingleton
. I thought exposing that was going too far. However, the user can still use scope.Attach, and that is reflected in the unit test for this. I also included instruction about this in the doc comments on IDisposeScopeClient. This keeps the API surface area smaller while still supplying the user what they need.
DataLoader also received the IDisposeScope to IDisposeScopeClient conversion.
I am aware the contribution guide says I also need to update release notes, but I haven't yet until I get more feedback on whether you find this contibution worthy of including. Also the release notes don't reference a future version, so I'm not sure how to add that. Do I just assume one version ahead and add a header?
@mvphelps, this is not a trivial update, so I'll have to take a look at it and ponder it, especially the comments about avoiding type checks in the discussion above.
Also the release notes don't reference a future version, so I'm not sure how to add that. Do I just assume one version ahead and add a header?
Yes, and update the version number in build/BranchInfo.props. Somebody already did the former in another PR, so if you could do that in this PR, that would be awesome.
Hmm... not (yet) sold on the separate interface.
Maybe it's worth breaking this up into two PRs -- one to add PackedSequence to DisposeScope-managed types, and one to then introduce IDisposeScopeClient (which is an odd name, by the way).
@NiklasGustafsson agree with being careful here. I have questions:
By "one to add PackedSequence to DisposeScope-managed types", do you mean something like my prior commit
Yes, exactly.
What specific concerns do you have RE the separate interface?
I want us to be careful about the API surface area, introducing new interfaces, etc. Let's consider that separately, since it's not just about getting support for PackedSequence, which I think is an obvious good.
I think the name is accurately descriptive, but it is not aesthetic.
Client/Server has too many connotations associated with distributed systems, so I think that word choice will be confusing or just seem odd (as it did to me).
I'm not clear on your release notes response.
You asked about the right way to do it -- bump the patch version and add new record(s). This time, someone already beat you to it, but didn't edit BranchInfo.props. So, do that in your PR, please.
@NiklasGustafsson Reverted the interface, fixed a bug I had, and made IsInvalid on PackedSequence internal. This did necessitate validating it via a reflection call in my test (see TestDisposeScopesPackedSequence), but I thought it worth it to avoid exposing this in the API for now. I also thought the reflection call better than using the InternalsVisibleTo attribute. Also updated the release notes.
Additionally made a test to verify the statistics are updating correctly and have identified a subtle possible bug. In DisposeScope.Attach(IEnumerable
Just added one more took a walk and realized my implementation was dumb - as a DisposeScope was managing 5 objects for each PackeSequence. This update just detaches all the internal tensors and lets the PackedSequence lifetime manage them since it is already IDisposable.
Never Ignore... :-)
Easiest to track in separate bug, fix it later, since it has no functional impact and affects other APIs, too.
Shold 'IsInvalid' be 'IsValid' instead? It seems unusual to have express it in terms of invalidity instead of validity. On Tensor, it's 'IsValid,' I believe.
Ok I'll make an issue for that.
I don't like IsInvalid either. I was maintaining consistency with Tensor, which uses IsInvalid: https://github.com/dotnet/TorchSharp/blob/538f2e3a817ac02b5bcc35f1bdc8acbc2af08ba8/src/TorchSharp/Tensor/Tensor.cs#L99
@NiklasGustafsson Shall I update PackedSequence to do IsValid, or leave it to maintain consistency? We also can't change Tensor since that is long standing in the API. Unless we want to [Obsolete] it.
Well, I stand corrected. Leave it 'IsInvalid' for consistency.
Ready to merge?
I do not have any remaining concerns. If you think this is ready, let's merge it! :1st_place_medal:
@NiklasGustafsson color me amazed! :astonished: I never expected to see this merged in just a day! I appreciate your quick responses and insights. I am very impatient (working on it) and you didn't make me feel like I was waiting. I look forward to contributing more in the future. Starting with the statistic issue next week. THANK YOU!!!!
Let's see how the CI/CD pipeline goes. This morning, there was some problem with signing packages.
@NiklasGustafsson You are right it failed on signing again - 500 internal server error. Is there any way I can contribute to help move this along? I'm assuming not as it's likely something only maintainers have access for, but my help is available if needed.
I've asked for internal help, and still waiting. The error isn't easy to diagnose, exactly. There must be some log, somewhere, that can be consulted.
Apparently, the cert used to sign has expired. Getting a new one.
Success!
Nice! When will it show up on Nuget? I don't see it available - I've even enabled prerelease packages. Does it just take a while?
Also it looks like maybe the build skipped the publishing part? See https://dotnet.visualstudio.com/TorchSharp/_build/results?buildId=110780&view=results, the push*packages targets appear all skipped.
The build status on the ReadMe page still shows v0.103.0. I don't think the build went through. It did some signing but it hasn't published any packages.
That's because I manually publish the packages when there's a critical mass of PRs that have gone into it.
Can now create a PackedSequence within a NewDisposeScope, and the .MoveToOther, .MoveToOuter, etc. Also can use disposeScope.Attach(PackedSequence).
Looking for feedback if we want a common interface to share between Tensor and PackedSequence. This would eliminate a bunch of type checks. I followed the existing pattern for this initial commit. This is my first PR here, LMK if I'm not conforming to repo norms. Thank you.