gohugoio / hugo

The world’s fastest framework for building websites.
https://gohugo.io
Apache License 2.0

multilingual: add config option to merge content with default language content #5612

Closed gcushen closed 1 year ago

gcushen commented 5 years ago

Ever since the multilingual capability was added to Hugo, Academic has received a huge number of requests for having the same content in multiple languages without duplication.

For example, you have an English and Chinese version of your site. You mostly post in English and would like the English content available on the Chinese version of your site, for pages that have not already been translated, so that the Chinese version is complete and not missing any content.

Whilst the lang.Merge function was added to help solve this problem, it was designed more around the use case of building a theme yourself with a specific use case in mind.

It's impractical to implement lang.Merge in a large theme with many content types, as it involves changing almost every range statement and adding Go templating logic to build a robust solution that lets users choose whether to merge content and, if so, which language to merge with.

I propose adding a parameter to config.toml to ask the user if they wish to automatically merge their multilingual content with their default language content. Then you have more or less a single switch for merging content which simplifies the user experience too. Of course, it would involve some changes to backend Hugo functionality too in order to handle merging automatically.

I know there are numerous threads on lang.Merge related issues, so I apologise if I missed a similar post!

bep commented 5 years ago

> Whilst the lang.Merge function was added to help solve this problem, it was designed more around the use case of building a theme yourself with a specific use case in mind.

No, lang.Merge was not added to solve this problem. This is another problem.

bep commented 5 years ago

Note that I agree that we need something like this, but there are some considerations, quick notes:

fthorns commented 5 years ago

I'd love to see this feature in Hugo. It would be particularly helpful for pages which consist mostly of front matter whose values are rendered by an i18n template. In this case it would even make sense to have the resulting pages within each of the different languages, as they would be unique to that language due to the theme.

It might be worth considering front matter variables to disable this for a specific page and, in addition, blacklists and whitelists of languages.

inwardmovement commented 4 years ago

Here is my vision of this (sorry if it doesn't add much to the topic but it may clarify things).

As far as I understand, we handle language fallback, but we don't handle automatic merging of missing content in general. It would be great to have the missing translation automatically merged/duplicated into the language folder that lacks it, given a structure like the following:

/content/en/
    post-1.md
    post-2.md
/content/fr/
    post-1.md

It would generate:

/en/post-1
/en/post-2
/fr/post-1          < manually translated
/fr/post-2          < automatically duplicated/merged from en content

likewise commented 4 years ago

Similarly,

/content/
  post-1.md
  post-2.md
  post-1.fr.md

would generate the same.

That is, Hugo should iterate over all combinations of {entries, languages} and, for each missing combination, pick the content with the lowest weight value.
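
The lookup described above can be sketched as a small shell function. This is only an illustration, not how Hugo resolves content: `resolve_entries` is a hypothetical helper, it assumes the flat `name.md` / `name.LANG.md` layout shown above, it treats any `*.*.md` file as a translation, and it falls back to the single default-language file instead of comparing weights across several languages.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hugo): for every default-language entry
# in a directory, print which file would serve the requested language.
# Assumes default files are named NAME.md and translations NAME.LANG.md.
resolve_entries() {
  dir=$1
  lang=$2
  for file in "$dir"/*.md; do
    [ -f "$file" ] || continue
    name=${file##*/}
    case $name in
      *.*.md) continue ;;  # assumption: a second dot marks a translation
    esac
    base=${name%.md}
    if [ -f "$dir/$base.$lang.md" ]; then
      # a translation exists, use it
      echo "$base -> $base.$lang.md"
    else
      # no translation, fall back to the default-language file
      echo "$base -> $name (fallback)"
    fi
  done
}
```

Calling `resolve_entries content fr` on the layout above would list post-1 as served by post-1.fr.md and post-2 as served by post-2.md as a fallback.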

Here is a Makefile and shell script that generates the missing .nl translations as soft-links from their default counterparts. It performs a lot of tests not to be destructive, but be sure to have a backup. It generates a remove.sh script to get rid of the soft-links again in a non-destructive manner.

Makefile:

.PHONY: all
all:
	./remove.sh && rm -v remove.sh || true
	./translate.sh
	nice hugo -D
	./remove.sh && rm -v remove.sh

translate.sh:

#!/bin/bash
# Create missing .nl translations as soft-links to their default-language
# counterparts, and record every link in remove.sh so it can be undone.
# Uses pushd/popd, so it needs bash rather than a plain POSIX sh.

if [ -f remove.sh ]; then
  echo "remove.sh exists, exiting."
  exit
fi
printf '#!/bin/sh\n\n' >remove.sh
chmod +x remove.sh

REMOVE="$PWD/remove.sh"

# append a guarded removal command for one soft-link to remove.sh
record_removal() {
  {
    echo "if [ -L \"$1\" ]; then"
    echo "  rm -v -- \"$1\""
    echo "fi"
  } >>"$REMOVE"
}

for dirpath in content/pages content/posts; do
  echo "In $dirpath:"
  pushd "$dirpath" >/dev/null || continue
  for file in *.*; do

    # skip if not an actual file
    if [ ! -f "$file" ] && [ ! -L "$file" ]; then continue; fi

    # skip existing .nl translations
    case "$file" in
      *.nl.md|*.nl.html) continue ;;
    esac

    # get file name without (last) suffix (.md, .html, etcetera)
    base="${file%.*}"
    # get suffix, only the part after last period
    suffix="${file##*.}"

    # assemble translated file name
    translation="${base}.nl.${suffix}"
    # check if translated file already exists
    if [ ! -e "$translation" ] && [ ! -L "$translation" ]; then
      ln -s "$file" "$translation"
      echo "created $dirpath/$translation soft-link"
      record_removal "$dirpath/$translation"
    elif [ -L "$translation" ]; then
      dest=$(readlink -f "$translation")
      target=$(basename "$dest")
      if [ "$target" = "$file" ]; then
        echo "$translation already is a soft-link to default, skipping"
        record_removal "$dirpath/$translation"
      else
        echo "$translation already is a soft-link to $dest, skipping"
      fi
    fi
  done
  popd >/dev/null
done

kirillaristov commented 4 years ago

This is a very convenient and necessary function for filling in pages with missing languages from the main language. I have used the flat-file CMS Grav, which implements exactly the logic described by the author of this issue.

NicoHood commented 3 years ago

Any updates on this one?

Did I understand correctly that if mypost.de.md is missing, mypost.md is currently not loaded? At least that is what I am seeing. This would mean I have to translate every page; I cannot leave some pages in English, as they would return 404. Is this what the issue is about, or am I seeing a different problem?

jmooring commented 1 year ago

This is possible with v0.96.0 and later with content mounts. See this example.
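
A minimal sketch of what such a configuration might look like, assuming the content/en and content/fr layout used earlier in this thread. The language codes, weights, and directory names are illustrative assumptions and not taken from the linked example; check the exact keys against the Hugo module mounts documentation before relying on them.

```toml
defaultContentLanguage = 'en'

[languages.en]
weight = 1

[languages.fr]
weight = 2

# English content serves the English site
[[module.mounts]]
source = 'content/en'
target = 'content'
lang = 'en'

# French translations serve the French site
[[module.mounts]]
source = 'content/fr'
target = 'content'
lang = 'fr'

# Assumed fallback: English files fill the gaps for pages
# that have no French translation
[[module.mounts]]
source = 'content/en'
target = 'content'
lang = 'fr'
```

The idea is that the French site mounts two directories onto the same target, with the French directory listed first so that an existing translation is preferred over the English file of the same name.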
