danmunn / redmine_dmsf

Fork of svn repository for redmine_dmsf
GNU General Public License v2.0

Update Documents Links in wiki and issues pointing to new dms folder #35

Open cforce opened 12 years ago

cforce commented 12 years ago

Enhancement for the rake Documents-to-DMSF conversion task: https://github.com/danmunn/redmine_dmsf/wiki/Migrating-from-Documents-to-DMSF

jniggemann commented 12 years ago

Terence, I'm not sure I understand. You have links in your wiki and issues that point to a DMSF folder and you'd like to bulk update all of them to point to a new folder instead?

danmunn commented 12 years ago

Terence,

Can you clarify what's being asked for or hinted at in this issue/request?

cforce commented 12 years ago

I want the conversion task to be able to convert the following Redmine document markup links (in wiki pages or issues) into the new DMSF syntax after the documents have been moved to DMSF:

Documents:

- `document#17` (link to the document with id 17)
- `document:Greetings` (link to the document with title "Greetings")
- `document:"Some document"` (double quotes can be used when the document title contains spaces)
- `sandbox:document:"Some document"` (link to a document with title "Some document" in the other project "sandbox")
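For illustration, a conversion along these lines might look roughly like the Ruby sketch below. The `{{dmsf(<file_id>)}}` target syntax and the `id_map` lookup (old document id or title to new DMSF file id) are assumptions made for the example, not something the rake task currently provides.

```ruby
# Illustration only: match the old Redmine "Documents" link syntax and rewrite
# it to a DMSF link. The {{dmsf(...)}} target and the id_map argument are
# assumptions for the sake of the example.
DOCUMENT_LINK = /
  (?:(\w+):)?              # optional project identifier, e.g. sandbox:
  document
  (?:
      \#(\d+)              # document#17
    | :"([^"]+)"           # document:"Some document"
    | :([^\s"]+)           # document:Greetings
  )
/x

def convert_document_links(text, id_map)
  text.gsub(DOCUMENT_LINK) do |match|
    key     = $2 || $3 || $4   # old document id or title
    file_id = id_map[key]      # hypothetical old-document -> DMSF-file mapping
    file_id ? "{{dmsf(#{file_id})}}" : match  # leave unknown links untouched
  end
end
```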

danmunn commented 12 years ago

Although I can see an advantage to this, I'm not sure making replacements on document contents is necessarily the best idea in practice. If there was a false positive we could end up mis-linking documents, not to mention the overhead in queries needed to iterate through entries and make replacements for each moved file.

Ideally, for each migrated entry we'd need to re-process the text of every wiki page and issue, identify the best possible matches, and replace as we go - however, this would not be very reliable. Are you aware of any other plug-ins with similar functionality?

cforce commented 12 years ago

Well, we can't migrate from Documents to DMSF if the old links aren't migrated to new links. Otherwise we get dead links everywhere in wiki pages and issues.

Migration is not fully possible then.

danmunn commented 12 years ago

We can look at alternatives; however, an automated process that sifts through text in a potentially destructive way is not necessarily best practice. We would need to iterate through each issue, all subsequent responses to it, every wiki page, etc., and there is going to be a large database overhead to this.

cforce commented 12 years ago

The overhead is technically the same as doing it manually, and having an automatic way of doing it is less work for the user. Why would you want to do it destructively? I think the search-and-replace features of MySQL can do it very fast; you really underrate MySQL performance. Besides, I think most users won't have gigabytes of database content, as it's just text and not blobs in there.

danmunn commented 12 years ago

To be honest I need to go through the task anyway, as I'm overhauling chunks left, right and centre at the moment with the progression to the 1.5.0 release. That being said, the way the existing task works is to literally iterate through everything, and new DMSF entities are created individually as they are found. If I were to extend that approach, we'd be in a situation where (number of wiki pages + number of issues + number of replies) * number of converted documents * 2 queries would be run. If a person has 200-300 files, 30 wiki pages, and 400-500 issues with an average of, say, 3 replies each, that's effectively: (30 + 500 + (500 * 3)) * 300 * 2 = 1,218,000 queries (on a 1:1 read/write). That's obviously not taking into account queries for the actual file entities themselves.

If I were to completely re-factor the code and implement a level of data caching for a mass update, instead we'd see (30 + 500 + (500 * 3)) * 2 = 4060 queries (again on a 1:1 read/write).
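As an illustration only, the cached variant could look roughly like the sketch below: the document-to-DMSF mapping is built once in memory and each text-bearing record is then read and conditionally written exactly once. The `convert_document_links` helper is the hypothetical one sketched earlier in the thread, and the model/column choices are assumptions, not the plugin's actual code.

```ruby
# Sketch only: one pass over every text-bearing record with the
# document -> DMSF mapping held in memory, so each record costs roughly
# one read and (at most) one write.
id_map = {}   # e.g. { '17' => 42, 'Greetings' => 43, ... }, filled in while migrating

targets = [
  [WikiContent, :text],
  [Issue,       :description],
  [Journal,     :notes]
]

targets.each do |model, column|
  model.find_each do |record|
    original = record.send(column).to_s
    updated  = convert_document_links(original, id_map)  # hypothetical helper (see sketch above)
    record.update_column(column, updated) if updated != original
  end
end
```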

P.S. These figures are literally there to demonstrate a thought; don't take them as anything beyond that, as they are neither exhaustive nor based on any testing.

To address the destructive comment above: any hard change to data that is not reversible could be considered destructive, as it cannot be reconstituted into something meaningful - that is what I meant to imply. If there were an error in the conversion process we could end up damaging content, which would be less than desirable.

jniggemann commented 12 years ago

Can't we write the old and new references to a temp table and then use `UPDATE [table_name] SET [field_name] = REPLACE([field_name], '[string_to_find]', '[string_to_replace]');`?
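For illustration, a rake task doing this could look something like the sketch below. The list of (old link, new link) pairs is assumed to have been collected while migrating the documents, the `{{dmsf(...)}}` targets are the assumed new syntax, and the table/column pairs shown are the obvious candidates rather than an exhaustive list.

```ruby
# Sketch of the REPLACE()-based approach: the replacements list is assumed to
# have been built during the migration as [old link text, new link text] pairs.
replacements = [
  ['document#17',              '{{dmsf(42)}}'],   # target syntax assumed for illustration
  ['document:"Some document"', '{{dmsf(43)}}']
]

conn   = ActiveRecord::Base.connection
tables = { 'wiki_contents' => 'text', 'issues' => 'description', 'journals' => 'notes' }

tables.each do |table, column|
  replacements.each do |old_link, new_link|
    conn.update(
      "UPDATE #{table} " \
      "SET #{column} = REPLACE(#{column}, #{conn.quote(old_link)}, #{conn.quote(new_link)})"
    )
  end
end
```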

danmunn commented 12 years ago

Jan, I had thought of that; however, we'd probably need to run it in a variation for each of the naming conventions. Doing it code-wise was considered safer on the grounds that a plain replace could end up replacing incorrect references (although unlikely): http://www.google.co.uk/document#13 would, for example, be hit, when it wouldn't be by Redmine's processing regex.
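To illustrate the kind of false positive being described, here is a small example; the lookbehind guard shown is just one way such a hit could be avoided, not Redmine's actual parsing rule.

```ruby
# A plain substring/regex replace hits the "document#13" inside a URL,
# while a pattern that refuses a preceding "/" or word character does not.
text = 'See http://www.google.co.uk/document#13 and also document#13'

text.scan(/document#\d+/).size
# => 2   (naive pattern also matches inside the URL)

text.scan(%r{(?<![/\w])document#\d+}).size
# => 1   (only the standalone Redmine-style link matches)
```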