ZOO-Project / migration

The place where the migration of the ZOO-Project.org SVN/trac system to Github.com will be done

Migrate Trac Wiki pages to GitHub Wiki #5

Closed jmckenna closed 3 years ago

jmckenna commented 3 years ago
gfenoy commented 3 years ago

For the wiki migration the method explained here may be investigated.

It basically consists of dumping the wiki from Trac using the following command:

trac-admin /home/trac/suivi-zoo-project/trac/ wiki dump ./

Then clone the target repository's wiki locally and add your files as-is; the wiki pages should become available as soon as you push them back to the GitHub wiki.
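A minimal sketch of these two steps, assuming the Trac environment path from the command above. The function names and the owner/repository arguments are hypothetical. As far as I know, GitHub only creates the wiki repository once a first page exists, so the clone may fail on a brand new wiki.

```shell
#!/bin/sh
# Sketch only: function names and arguments are illustrative assumptions.
TRAC_ENV=/home/trac/suivi-zoo-project/trac/

# A GitHub wiki is a separate Git repository whose URL is the main
# repository URL with ".wiki.git" appended.
wiki_clone_url() {
  printf 'https://github.com/%s/%s.wiki.git\n' "$1" "$2"
}

# Dump every Trac wiki page, copy the dump as-is into a clone of the
# target wiki, then commit and push.
publish_trac_wiki() {
  owner=$1
  repo=$2
  dump_dir=$(mktemp -d) || return 1
  trac-admin "$TRAC_ENV" wiki dump "$dump_dir" || return 1
  git clone "$(wiki_clone_url "$owner" "$repo")" "$repo.wiki" || return 1
  cp "$dump_dir"/* "$repo.wiki"/ || return 1
  (cd "$repo.wiki" && git add . && git commit -m "Import Trac wiki pages" && git push)
}

# Usage (not executed here): publish_trac_wiki <owner> <repo>
```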

jmckenna commented 3 years ago

Nice find. I agree, sounds like it would be worth a try.

gfenoy commented 3 years ago

I have made a first attempt at importing the wiki pages into the ZOO-Project-svn2git repository.

You can see the corresponding Wiki pages from here.

We have a total of 301 wiki pages. Some of them do not need to be migrated: all the Trac-specific pages, and all the pages used to build the current ZOO-Project.org website, which relies on the Trac wiki and uses the TracXMLRPC extension to fetch the raw content of wiki pages and build web pages out of WPS Execute requests.

Total pages not to be kept:

Consequently, 110 wiki pages have to be kept in the GitHub wiki.

Summary of the Trac wiki macros that are not usable out of the box:

The wiki pages produced by trac-admin wiki dump have flat page names, meaning that a Trac wiki page named PSC/Meetings becomes a GitHub wiki page named PSC%2FMeetings.
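To illustrate the flat naming, here is a small sketch that maps a dumped file name back to the original Trac page path. The function name is hypothetical, and only the %2F escape is handled; any other percent-escape in a file name is left untouched.

```shell
#!/bin/sh
# Sketch: recover the Trac page path from a flat dump file name.
# Only the URL-encoded slash (%2F) is decoded.
trac_page_path() {
  printf '%s\n' "$1" | sed 's:%2F:/:g'
}

# Example call: trac_page_path 'PSC%2FMeetings'
```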

The wiki images are not currently imported into the GitHub wiki. Note that for a wiki page A with attachments, the following link can be used to download a zip archive containing all the attached files: http://zoo-project.org/trac/zip-attachment/wiki/A/ (e.g. for the page http://zoo-project.org/trac/wiki/ZooWorkshop/FOSS4GJapan/BuildingWPSClientUsingOL/ the archive is at http://zoo-project.org/trac/zip-attachment/wiki/ZooWorkshop/FOSS4GJapan/BuildingWPSClientUsingOL/).
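The link pattern can be written down as a tiny helper. This is only a sketch: the function and variable names are hypothetical, and the URL layout is taken from the two example links above.

```shell
#!/bin/sh
# Sketch: build the zip-attachment URL for a given Trac wiki page name.
TRAC_BASE=http://zoo-project.org/trac

zip_attachment_url() {
  printf '%s/zip-attachment/wiki/%s/\n' "$TRAC_BASE" "$1"
}

# Example call: zip_attachment_url ZooWorkshop/FOSS4GJapan/BuildingWPSClientUsingOL
```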

jmckenna commented 3 years ago

I think when I created this ticket, I was thinking in the normal project-sense (where a project is on Trac/SVN, their source code is stored there, they may/usually have a separate repository for html/website stuff, and then a few wiki pages enhance what is missing on the main html/website). Here, I think my ticket was confusing, because now I realize that ZOO-Project had every single html/website page stored as a wiki page, correct? In that case, those html/website pages should not be imported to Github wiki, only the wiki pages such as Code Sprint agendas, Meeting agendas (things not part of an official website), should be moved to Github wiki.

Example: having a visible 'Download' wiki page (imported from the Trac wiki), and then having our official website HTML download page (likely a Sphinx reStructuredText file, stored in a separate repository), would become an absolute disaster for users and maintainers. So this was my confusion when I initially said that all wiki pages should be imported: in the normal project sense yes, but not in the ZOO-Project sense, where all HTML pages were in fact Trac wiki pages.

Sorry for my misunderstanding. Anyway we can discuss this in the meeting tomorrow.

gfenoy commented 3 years ago

Starting from the ZOO-Project-svn2git.wiki repository that was created initially, I tried the following update to fix the issue by converting the Trac wiki syntax into MediaWiki syntax.

First I removed the unneeded Wiki pages:

git rm ZooWebSite* Trac* Inter* CamelCase* SandBox* Page* WikiD* WikiF* WikiM* WikiN* WikiP* WikiR* Web*

Running the following commands creates the subdirectories and moves the wiki pages to the correct place:


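# Flat names look like A%2FB%2FC.wiki: split on '%' and strip the leading "2F".
for i in *%2F* ;
do
  [ -e "$i" ] || continue;   # nothing matched the pattern
  j=$(echo "$i" | cut -d'%' -f1);                                # top-level directory
  k=$(echo "$i" | cut -d'%' -f2 | grep -v wiki | sed "s:^2F::"); # second-level directory, if any
  l=$(echo "$i" | cut -d'%' -f3 | grep -v wiki | sed "s:^2F::"); # third-level directory, if any
  m=$(echo "$i" | cut -d'%' -f4 | grep wiki | sed "s:^2F::");    # file name at depth 4
  n=$(echo "$i" | cut -d'%' -f3 | grep wiki | sed "s:^2F::");    # file name at depth 3
  o=$(echo "$i" | cut -d'%' -f2 | grep wiki | sed "s:^2F::");    # file name at depth 2
  echo "$i" "$j" "$k" "$l" "$m" "$n" "$o";
  if [ -z "$l" ] ;
  then
    if [ -z "$k" ] ;
    then
      # mkdir -p: the directory may already exist from a previous page
      mkdir -p "$j";
      git mv "$i" "$j/$o" ;
    else
      mkdir -p "$j/$k";
      git mv "$i" "$j/$k/$n" ;
    fi ;
  else
    mkdir -p "$j/$k/$l";
    git mv "$i" "$j/$k/$l/$m";
  fi ;
done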
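For comparison, a shorter sketch of the same idea that handles any nesting depth by decoding %2F and creating the parent directory of the result. The function name is hypothetical. It takes the move command as optional arguments so it can be tried with plain mv outside a Git repository, and it assumes no page file shares its exact name with a directory that has to be created.

```shell
#!/bin/sh
# Sketch: move flat A%2FB%2FC.wiki files in the current directory to A/B/C.wiki.
# Usage: nest_flat_pages [move-command...]   (defaults to "git mv")
nest_flat_pages() {
  [ "$#" -gt 0 ] || set -- git mv
  for flat in *%2F*; do
    [ -e "$flat" ] || continue                      # the glob matched nothing
    nested=$(printf '%s\n' "$flat" | sed 's:%2F:/:g')
    mkdir -p "$(dirname "$nested")"                 # create A/B if missing
    "$@" "$flat" "$nested"
  done
}
```

Calling nest_flat_pages with no argument from inside the wiki clone uses git mv, as in the loop above; nest_flat_pages mv is handy for a dry run on a copy.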

Then I converted the pages using the TracWiki2MediaWiki.pl Perl script available from https://www.xpra.org/trac/ticket/2967:

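for i in $(find ./ZOO-Project-svn2git.wiki/ -name "*wiki");
do
  # First pass: the custom rewrite script prints its result on stdout.
  python ~/migration/rewrite_wiki.py "$i" > "$i.after";
  mv "$i.after" "$i";
  # Second pass: the mv below relies on the Perl script writing $i.after.
  perl ~/Downloads/TracWiki2MediaWiki.pl "$i";
  mv "$i.after" "$i";
done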

Following these instructions gives the expected result. Nevertheless, the directories, even though they are present in the ZOO-Project.wiki repo, are not accessible online.

gfenoy commented 3 years ago

I converted the pages again, to Markdown syntax this time.

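for i in $(find . -name "*.wiki") ;
do
  # Only replace the trailing .wiki extension; an unanchored ".wiki" pattern
  # would also rewrite matches elsewhere in the path.
  pandoc --from mediawiki --to markdown "$i" -o "$(echo "$i" | sed 's:\.wiki$:.md:')";
done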
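As an alternative way to derive the output file name, here is a sketch using POSIX parameter expansion, which only strips a trailing .wiki suffix and leaves the rest of the path alone. The function name is hypothetical.

```shell
#!/bin/sh
# Sketch: map "dir/Page.wiki" to "dir/Page.md" without a regular expression.
# ${1%.wiki} removes the suffix ".wiki" only if the name ends with it.
md_name() {
  printf '%s\n' "${1%.wiki}.md"
}

# Possible use inside the loop:
#   pandoc --from mediawiki --to markdown "$i" -o "$(md_name "$i")"
```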

Doing so does not solve the issue with using directories and subdirectories in the GitHub wiki.

So I started again from the initial ZOO-Project-svn2git.wiki repo and ran the following commands:

for i in $(find . -name "*.wiki" | sed "s:^\./::" );
do
  sed "s#{{TracNotice|{{PAGENAME}}}}##g;s#{{TracNotice|{{PAGENAME</pre>}##g" "$i" > "$i.after";
  python3 ../rewrite_wiki1.py "$i.after" > "$i" ;
done

rm $(find . -name "*after")

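for i in $(find . -name "*.wiki" | sed "s:^\./::" );
do
  # cut -s prints nothing when the path contains no '/', so top-level
  # pages fall through to the "later" branch instead of being moved.
  j=$(echo "$i" | cut -s -d'/' -f2 | grep wiki);   # file name at depth 2
  k=$(echo "$i" | cut -s -d'/' -f3 | grep wiki);   # file name at depth 3
  l=$(echo "$i" | cut -s -d'/' -f4 | grep wiki);   # file name at depth 4
  m=$(echo "$i" | cut -d'/' -f1);                  # first path component
  n=$(echo "$i" | cut -d'/' -f2);                  # second path component
  o=$(echo "$i" | cut -d'/' -f3);                  # third path component
  p=$(echo "$i" | cut -d'/' -f1 | grep md);        # non-empty: skip this path
  echo "$i" "$j" "$k" "$l" "$m" "$n" "$o";
  if [ -z "$p" ] ; then
    if [ -z "$j" ] ; then
      if [ -z "$k" ] ; then
        if [ -z "$l" ] ; then
          echo later;
        else
          git mv "$i" "$m/$n/$o/$m: $n: $o: $l" ;
        fi ;
      else
        git mv "$i" "$m/$n/$m: $n: $k" ;
      fi ;
    else
      git mv "$i" "$m/$m: $j";
    fi ;
  fi ;
done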

For every file in directory A, the wiki page name should be prefixed with A:. For instance, the wiki page previously named PSC/meetings is now accessible under the name PSC:-meetings.
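The naming rule can be sketched as two small helpers. Both function names are hypothetical, paths are expected without a leading "./", and the hyphen substitution mirrors the PSC:- name quoted above, where the space in the file name shows up as a hyphen in the page name.

```shell
#!/bin/sh
# Sketch: "PSC/meetings.wiki" -> "PSC/PSC: meetings.wiki".
# Each directory component becomes a "Name: " prefix of the file name.
prefixed_page_file() {
  dir=$(dirname "$1")
  base=$(basename "$1")
  if [ "$dir" = "." ]; then
    printf '%s\n' "$base"          # top-level pages keep their name
    return
  fi
  prefix=$(printf '%s\n' "$dir" | sed 's#/#: #g')
  printf '%s/%s: %s\n' "$dir" "$prefix" "$base"
}

# Sketch: page name as it appears online, with the extension dropped
# and spaces replaced by hyphens.
wiki_url_name() {
  name=$(basename "$1" .wiki)
  printf '%s\n' "$name" | sed 's/ /-/g'
}
```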

Sample pages:

Remaining issues:

gfenoy commented 3 years ago

Trac wiki page attachments can be downloaded using the zip export. Running the following command on the Trac host creates, for each wiki page with attachments, a folder containing an archive.zip with all the files attached to that page.

for i in $(trac-admin /home/trac/suivi-zoo-project/trac/ wiki list | \
  grep -v "\-\-\-\-" | grep -v "Title " | awk '{print $1}');
do
  echo "$i" ;
  # Count the lines of the attachment listing once separator and header rows are removed.
  j=$(trac-admin /home/trac/suivi-zoo-project/trac/ attachment list "wiki:$i" |\
    grep -v "\-\-\-\-" | grep -v "Name " | wc -l) ;
  if [ "$j" -lt 3 ] ; then
    echo No need;
  else
    mkdir -p "${i}_attachments";
    wget -O "${i}_attachments/archive.zip" "http://zoo-project.org/trac/zip-attachment/wiki/$i/" ||\
     (echo "**** FAILED ****"; rm "${i}_attachments/archive.zip") ;
  fi ;
done

Unzip the archives, then delete them.

for i in $(find . -name "*zip");
do
  # Extract next to the archive, then remove the archive itself.
  unzip "$i" -d "$(dirname "$i")" ;
  rm "$i";
done

So, for example, the Invitation page for the ZOO-Project workshop in 2013 referred to an image which is now available in the ZOO-Project.wiki repo here.

Some archives could not be produced; they will be processed later on. The empty directories in the ZOO-Project.wiki repo correspond to the failed archive downloads.
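A quick way to list those failed downloads, as a sketch: the function name is hypothetical, the _attachments suffix comes from the download loop above, and -empty is a find extension available in GNU and BSD find rather than strict POSIX.

```shell
#!/bin/sh
# Sketch: print every *_attachments directory that ended up empty,
# i.e. whose archive download failed and was removed.
list_failed_attachment_dirs() {
  find "${1:-.}" -type d -name '*_attachments' -empty
}

# Example call from the wiki clone: list_failed_attachment_dirs .
```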

jmckenna commented 3 years ago

@gfenoy nice work on getting the attachments included!

Regarding the wiki pages, I think a wiki page with the previous name of PSC/meetings must become a new page named PSC-meetings (notice there is no ":" in the name)

gfenoy commented 3 years ago

Images are now available from wiki pages.

Sample:

Note:

I think that we are done with the wiki.

gfenoy commented 3 years ago

As discussed during today's PSC meeting, we should now consider the ZOO-Project/ZOO-project.wiki repo as the official wiki for the ZOO-Project.

I am closing this issue.

If there is any problem with a current wiki page, or a need to integrate wiki pages lost during the move, please feel free to re-open it.

omshinde commented 3 years ago

The default formatting style for the imported wiki pages is MediaWiki. So, for new wiki pages, shall we keep the same MediaWiki formatting, or shall we use Markdown (which is widely used for GitHub wiki pages)?

venka-foss4g commented 3 years ago

I think markdown could be better.