Open orchardbot opened 12 years ago
AimOrchard commented:
In addition, I'd like to ask if you could tell me what I need to do (sql-wise) to remove all users except admin.
I want to get rid of our test users so we can (for now) use the import/export function until a valid solution / fix is found for this issue.
AimOrchard commented:
Ok, export is also terrible...
var contentItems = _orchardServices.ContentManager.Query(options).List();
You fetch ALL content and THEN exclude the types you don't need... I have ~6k users in my test database, meaning ALL of them are fetched, including all their linked data (like roles).
To export ~4-5 valid items, it took 6102 queries!
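For reference, the pattern being described amounts to something like this (a sketch of the behaviour only, not the exact Orchard source; contentTypes stands for the requested type names):

// Every versioned content item is materialized first...
var contentItems = _orchardServices.ContentManager.Query(options).List();

// ...and only afterwards are the unwanted types excluded, in memory, so every user
// (and its linked parts, such as roles) has already been fetched from the database.
var itemsToExport = contentItems.Where(i => contentTypes.Contains(i.ContentType));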
AimOrchard commented:
Ok, I tried to fix it myself and failed... Could you please look at this ASAP? With this bug, the feature is pointless if you have any significant amount of content (users or 'real' content)!
AimOrchard commented:
Ok, I dug further and found a way to improve the query (so it 'only' queries ALL 'relevant' items, which is still bad if you have plenty of the requested content, imho).
In the ImportExportService ExportData method I now have this:
private XElement ExportData(IEnumerable<string> contentTypes, VersionHistoryOptions versionHistoryOptions) {
    var data = new XElement("Data");
    var options = GetContentExportVersionOptions(versionHistoryOptions);
    var contentTypesArray = contentTypes.ToArray();

    // Pass the requested content types to the query so the filtering happens database-side.
    var contentItems = _orchardServices.ContentManager.Query(options, contentTypesArray).List();

    foreach (var contentType in contentTypesArray) {
        var type = contentType;
        var items = contentItems.Where(i => i.ContentType == type);
        foreach (var contentItem in items) {
            var contentItemElement = ExportContentItem(contentItem);
            if (contentItemElement != null)
                data.Add(contentItemElement);
        }
    }
    return data;
}
Notice that I now supply 'Query' with the list of requested content types so that the filtering happens database-side. It needs some cleaning up, but with this change the export was instant, without all those thousands of queries.
The issue with import still remains, though; I'm investigating a bit further, but no promises.
AimOrchard commented:
So yeah, the problem lies in ImportContentSession.Get, since it looks like it goes (or can go) through all content items.
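Illustratively, the behaviour described above would look roughly like this (a sketch only; _contentManager, the comparer and the overall shape are approximations, not the actual ImportContentSession source):

public ContentItem Get(string id) {
    var importedIdentity = new ContentIdentity(id);
    var comparer = new ContentIdentity.ContentIdentityEqualityComparer();

    // Worst case: every existing content item is loaded and its identity is
    // compared against the one being imported.
    foreach (var existing in _contentManager.Query(VersionOptions.Latest).List()) {
        var existingIdentity = _contentManager.GetItemMetadata(existing).Identity;
        if (comparer.Equals(existingIdentity, importedIdentity))
            return existing;
    }
    return null; // not found; the importer will create a new item instead
}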
AimOrchard commented:
So any feedback on this?
@bleroy commented:
If I understand it correctly, the problem is that we need to compare identities with all content items when importing, to find whether the item being imported already exists. Identity being a collaborative process that we cannot make assumptions about, there isn't a good solution to this problem that we know of at this point. It's a harder problem than it seems.
@bleroy commented:
Thanks for the suggestion though; I think we should reevaluate that and at least apply some strategic optimizations that do partial filtering ahead of time. Re-opening for new triage.
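One possible shape of such a partial filter, purely as a hypothetical sketch (the "Identifier" attribute and the helpers are assumptions): index existing items once by a single identity attribute, then run the full collaborative identity comparison only within the matching bucket instead of against every item. It still enumerates the existing items once to build the index, but it avoids a full scan per imported item.

// Hypothetical pre-filtering for import: one pass builds a lookup keyed on a single
// identity attribute, so only items sharing that attribute get the full identity comparison.
var index = new Dictionary<string, List<ContentItem>>();
foreach (var existing in _contentManager.Query(VersionOptions.Latest).List()) {
    var identity = _contentManager.GetItemMetadata(existing).Identity;
    var key = identity.Get("Identifier") ?? "";   // assumed attribute accessor
    List<ContentItem> bucket;
    if (!index.TryGetValue(key, out bucket)) {
        bucket = new List<ContentItem>();
        index[key] = bucket;
    }
    bucket.Add(existing);
}
// At import time, only index[importedIdentity.Get("Identifier")] would need to be scanned.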
AimOrchard commented:
Well, some database-side filtering would be nice to start with
But in addition to that, if you cannot escape the fact that all the content HAS to be retrieved, it would be nice to split the required work into batches.
You could split both import and export into batches and give the admin a visualization of the current progress and the ability to cancel (and, if doable, the ability to pause / resume a batch), roughly as sketched below.
In addition to that, you could already apply the improvement I mentioned for exporting content (only query for content items of the requested types).
Another thing you could add is the ability to 'skip' the 'does content exist' check and just import as-is.
No point in checking all content items if the person doing the import 'knows' that none of the content that is being imported already exists.
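A rough sketch of the batching idea for the export side (hypothetical code, not an existing Orchard API; ReportProgress and IsCancellationRequested stand in for whatever progress/cancellation mechanism would be added): process the items in fixed-size pages, report progress after each page, and stop early if the admin cancels.

const int batchSize = 100;                  // hypothetical page size
var processed = 0;
while (true) {
    // Page through the (already type-filtered) query instead of calling List() once.
    var batch = _orchardServices.ContentManager
        .Query(options, contentTypesArray)
        .Slice(processed, batchSize)
        .ToArray();
    if (batch.Length == 0)
        break;

    foreach (var contentItem in batch) {
        var contentItemElement = ExportContentItem(contentItem);
        if (contentItemElement != null)
            data.Add(contentItemElement);
    }

    processed += batch.Length;
    ReportProgress(processed);              // hypothetical progress callback
    if (IsCancellationRequested())          // hypothetical cancel check
        break;
}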
@bleroy commented:
Sure would be nice (except for the part about skipping identity verification).
AimOrchard created: https://orchard.codeplex.com/workitem/18779
2 issues:
I made a query, exported all queries, removed the query, imported the exported data, and the query didn't reappear.
But the biggest issue is that the import is not usable if you have plenty of users.
I have 1000 users in my test database, and with just that, more than 4000 (!) queries were executed before I had to pause the profiler, because the one I use cannot handle that amount of queries.
You can see a stack trace of one of the queries (which is executed per user, for some reason) at:
https://dl.dropbox.com/u/23877279/Permanent/Screenshots/Bugs/import_fail_1.png https://dl.dropbox.com/u/23877279/Permanent/Screenshots/Bugs/import_fail_2.png
Both are also included in the attached zip (import_fail.zip)