Data source locking methods should be optional

facebookarchive / AsyncDisplayKit

Smooth asynchronous user interfaces for iOS apps.

http://asyncdisplaykit.org

Other

13.4k stars 2.2k forks source link

Data source locking methods should be optional #338

Closed cfcommando closed 8 years ago

cfcommando commented 9 years ago

They're only called if _asyncDataFetchingEnabled == YES, which is off by default, but still adds the missing implementation warnings. Perhaps rather than using a flag and requiring the methods, you could split the protocol into ASTableViewDataSource and extend it with ASTableViewAsyncFetchingDataSource, which would include the additional methods.

secretiverhyme commented 9 years ago

cc @tanliboy, @rnystrom, #303.

The data source locking methods are supposed to be called even when async data fetching is disabled, which is why @tanliboy wanted to make them a required part of the async data source protocol. If that's not the case, then we have a bug on our hands.

More generally, I think it would be worth revisiting this API before 1.2 ships. On the one hand, these lock/unlock methods are necessary if you want to avoid expensive work happening on the main thread, and because AsyncDisplayKit is all about unblocking the main thread, this feature is clearly necessary. On the other, it's not immediately obvious what the advantages and disadvantages of enabling "async data fetching" are, and we could expand on how the data source locking methods work.

tanliboy commented 9 years ago

Yes, it should be called even we fetch data in sync mode. Because we could call the editing method on ASTableView/ASCollection is thread safe and user may call it from any thread, the lock/unlock is still a useful/valid indicator for updating data source. I just send https://github.com/facebook/AsyncDisplayKit/pull/340 to fix it.

secretiverhyme commented 9 years ago

Merged #340, but keeping this issue open for discussion. @tanliboy, can you sum up the pros and cons of async data fetching mode? In what situations would you want to use synchronous data fetching?

cfcommando commented 9 years ago

I think this API needs to be rethought.

First off, it's a significant departure from a conventional UI{Table,Collection}View if calling lock methods is required for normal (synchronous) usage. The ASDK variants are intended to map conceptually to their UIKit variants, but it's unclear from inspection how the locking methods are intended to interact with the table/collection views, and I don't have a clear understanding of how they're used.

Secondly, developers can and will leave them unimplemented. Even those engineers who understand locking (and developers often don't) may not understand the necessity of the locking API, because they don't have in-depth knowledge of the AS{Table,Collection}View implementation. Even as an experienced user of ASDK I no longer have a firm grasp on what the intended pattern is. As the first and most useful and influential classes to new adopters of ASDK, the newly-required API for AS{Table,Collection}ViewDataSources, in my opinion, imposes an unacceptable barrier to usage that I think will prove a detriment to its adoption.

Thirdly, I stream updates through a data pipeline built around Promises. It's not clear what I need to do to continue streaming those updates with this API, other than buffer updates (hard enough by itself given my architecture is as close to stateless as I can make it) and add a layer of state tracking on top which blocks a queue on waiting for the table to be unlocked - which is further complicated by the fact that the methods may or may not be called on the same thread depending on how the table/collection view is set up. I may be missing something, but the threading behavior I need to make the most of the table/collection views is no longer clear.

Basically, adding this API will require significant additional work to make use of AS{Table,Collection}View, both in terms of developer education and application infrastructure around it.

cfcommando commented 9 years ago

Also, I asynchronously load new data myself in large batches (as the API loads lists based on numerical ranges rather than per-element object ids); using a percentage of the scroll position is easy enough to understand, but it's unclear how the intelligent network preloading range described in @appleguy 's presentations is related to this API.

tanliboy commented 9 years ago

Thanks all for discussion ! On second thought, I agree the API need to be refined or simplified to avoid confusing developers.

The lock/unlock function call is introduced in https://github.com/facebook/AsyncDisplayKit/pull/303, and the intention was to avoid main thread allocation for ASCellNode. In the previous implementation of ASTableView/ASCollection, we call the API of data source from calling thread to fetch data and keep data consistence. It works fine for data editing/updating, because these API of ASTable/ASCollectionView is thread safe, and the developers could always call it from background thread directly. However, for the initialization of ASTableView/ASCollection with large scale data, the reloadData of ASTableView/ASCollectionView will be invoked from main thread, and we hence fetch the data from main thread. It could possibly block the main thread once we create the ASCellNode in the data fetching function call.

Due to this reason, we got some issues on this problem, and we don't have simple way to fix it. Finally, we come out with the solution to resolve it using the lock/unlock function call to put the data fetching in background thread. It passes the responsibility to keep data consistence to client with lock/unlock data source. We call it asyncDataFetch. In the async data fetching, we will lock it from calling thread, and unlocked it from background thread once we have done the data fetching.

Pro: It could avoid blocking main thread in the initialization of ASTableView/ASCollection with large scale data. Con: It required the developer to lock & unlock thread although it is not hard to implement it.

Considering we only have medium scale data in the most use case, so we also keep the asyncDataFetch to be optional, and turn it off by default.

Optional & Required

In the first version of the PR, I made it to be optional and added runtime assertion for the case that we turned on the asyncDataFetch. Later, I feel it is confusing to add partial assertion to check the implementation of data source, and updated it to be required with the argument -- it is a resonable/useful function call both for async/sync data fetching.

To avoid confusing the developers, we could hide the lock/unlock for the typical user case. I will try to figure out other way to both keep compilation check for asyncDataFetching and remove the lock/unlock function for syncDataFetching. If there is no better solution, we could fall back to the runtime assertion to checking lock/unlock implementation in asyncDataFetching case.

cfcommando commented 9 years ago

If I'm reading this right, the lock/unlock API requires the data fetch to be synchronous, safe to call from a background thread, and fetching state must be managed by the table/collection views. I don't think that pattern is very general, and it doesn't fit with my usage, where I would prefer some sort of delegate method (or callback) letting me know of a range that is ready for fetches, and allow me to trigger network fetches and notify the collection of updates. I perform all of my data updates between beginUpdates/endUpdates or the UICollectionView equivalent to make sure my changes are synchronized with my data updating.

We need to more concretely explain and validate the approaches you're suggesting, IMO. Such a thing would be necessary to develop before tagging a 1.2 release, as this is likely to be a noteworthy feature of that release.

Do you think it would help to have a sample app built against the API to demonstrate the intended usage pattern and clear up the confusion? I may be completely misunderstanding the point of the locking, given that I lack much context into the rest of the async data fetch infrastructure; I don't really have a clear idea of the failure case this work is intended to prevent.

smeis commented 9 years ago

@tanliboy and @secretiverhyme could you perhaps share the current plans for the future of the locking mechanism and the asynchronous fetching? I've been using it since it was released and I'm very happy with the results, though I of course fully realize that we need to take in account as much use cases as possible.

While reading this discussion the following idea popped up in my head.

In light of the idea of creating ASDK components as drop in replacements for their UIKit counterparts, wouldn't it make things easier if we make ASCollectionView/ASTableView in sync fetching mode more like their UIKit counterparts? In the current implementation it's fine to call reloadData or the editing methods from a background queue, but as @cfcommando already pointed it this is already a departure from the way the UIKit components work. And I think that makes the use of ASDK more complex and increases the risk of errors because methods like nodeForItemAtIndexPath: can be called from any queue, and the implementation of that method needs to take that into account. If you'd have the ASDK components work on the main thread when they're in sync mode they would be a lot more like their UIKit counterparts.

I realize that this might go against the goals of ASDK but in my opinion it would make things more logical. That way you can just use it the same way you used the UIKit components and in this scenario you would still benefit from the preload ranges ASDK uses.

This would, in my opinion, also justify a more complex API for when you're using asynchronous fetching. You can use the ASDK components the same way you'd use the UIKit components and still benefit from some of the ASDK goodies but if you want the whole package you need to take that into account with your data management.

That's my 2 cents, wondering how you guys see this.

By the way I would also be more than willing to do some writing for the guides or create a sample project to make the asynchronous fetching API more accessible to other developers, once you make a decision about this part of the API. I was already thinking about putting the code of my data controllers (that replace nsfetchedresultscontroller and now have code for locking) on github anyway to discuss best approaches to the locking mechanism and help other developers.

ryanfitz commented 9 years ago

My 2 cents as a user of ASDK. I have an app where one of the screens cannot be implemented smoothly using standard tableview / collectionview. It deals with a large data set, relatively complex cells and needs to support pull to refresh and infinite scroll. Even with ASTableView before async data fetching, the ui at times would be jumpy while scrolling / loading more data.

I recently switched over to running off master and using async data fetching. With asyncDataFetchingEnabled I'm able to run all my operations on background threads and the ui works very nicely. I've found handling data source locking / unlocking to be very simple and straightforward. I have a generic datasource which I would gladly open source to help out other developers, it would probably work for the majority of peoples use cases.

The biggest bug I've found so far with data locking is #369. I'm sure the current locking implementation can be simplified, but I vote to keep the ability of async data fetching for those cases that require it.

appleguy commented 8 years ago

@cfcommando @ryanfitz @smeis Really happy to say that this has finally been done in a much better way.

https://github.com/facebook/AsyncDisplayKit/pull/1173

The asyncDataFetching option has been disabled, then deprecated for quite some time because of this problem. So it hasn't been hurting folks. Nodes are still pretty cheap to allocate, and so only complex apps or apps that add fairly expensive custom logic to init (perhaps data detectors or other things) will find this is essential, but it is so easy to adopt that there is no reason not to.

It allows us to concurrently allocate the cell nodes with the "nodeBlock" providing a simple and easy way to capture any required data model to initialize the node with. It is up to the implementing engineer to realize they should not simply capture the indexPath and access their datasource within the block, but rather access the datasource and provide the result as the object to capture in the block.

In Pinterest, which has many complex views, this change made a very substantial difference. It has proven stable so far, but is only days old and has not yet launched in a stable release (1.9.7 should include it in a couple weeks).

Please give it a try and file any issues you encounter, or send in feature requests. Thanks again for your feedback and sorry this one took a while!