ProgrammeVitam / vitam

Digital Archives Management System developed by the French government (Programme interministériel archives numériques); core system.
CeCILL Free Software License Agreement v2.1

Error disabling THP on Debian Stretch (Scaleway instance) #10

Closed: bxaxa closed this issue 4 years ago

bxaxa commented 5 years ago

Hello, when deploying on a Scaleway instance, the playbook fails while trying to start the service that disables transparent huge pages (THP), because transparent huge pages are not implemented in Scaleway's default kernel.

The error comes from this file:

vitam/deployment/ansible-vitam/roles/mongo_common/tasks/main.yml

The quick workaround is to comment out the offending part. I think the playbook should test whether transparent huge pages are present before trying to start the service that disables them.
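As a rough illustration of that idea, the sketch below only writes `never` when the kernel actually exposes the THP control file, and otherwise does nothing. The function names `thp_supported` and `disable_thp_if_supported` and the `THP_DIR` override are hypothetical and not part of the Vitam playbooks; this is a minimal sketch, not the fix that was eventually shipped.

```shell
#!/bin/sh
# Hypothetical helper (names are illustrative, not taken from Vitam).
# THP_DIR can be overridden, e.g. to try the logic against a scratch directory.
THP_DIR="${THP_DIR:-/sys/kernel/mm/transparent_hugepage}"

# Succeeds only if the kernel exposes the THP control file.
thp_supported() {
    [ -f "$THP_DIR/enabled" ]
}

# Writes "never" to the control file when it exists. When it does not,
# prints a notice on stderr and still returns success, so a caller
# (for example a systemd unit) would not fail on kernels without THP.
disable_thp_if_supported() {
    if thp_supported; then
        echo never > "$THP_DIR/enabled"
    else
        echo "THP not available under $THP_DIR, nothing to disable" >&2
    fi
}

# Example call (requires root on a real system):
#   disable_thp_if_supported
```

On a kernel without THP support, such as the one described here, the function would simply report that there is nothing to disable instead of exiting with an error.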

Workaround

cat ansible-vitam/roles/mongo_common/tasks/main.yml

---

# Install mongodb-org (needed by vitam-mongo* packages)
# and then disable the mongod service as it may start listening on 27017 port which is needed by vitam-mongos
- name: Install mongodb-org package
  package:
    name: mongodb-org
    state: latest
  register: result
  retries: "{{ packages_install_retries_number }}"
  until: result is succeeded
  delay: "{{ packages_install_retries_delay }}"

- name: Disable mongodb default service
  service:
    name: mongod
    state: stopped
    enabled: no

### System tuning best practices ####

# next steps in order to disable Transparent HugePages
# cf https://docs.mongodb.com/manual/tutorial/transparent-huge-pages/

- name: Check if tuned is installed
  stat:
    path: /usr/lib/systemd/system/tuned.service
  register: tuned_service_status
  when: ansible_os_family == "RedHat"

- name: Check if tuned is installed
  stat:
    path: /lib/systemd/system/tuned.service
  register: tuned_service_status
  when: ansible_os_family == "Debian"

- set_fact: status_tuned_present="{{ tuned_service_status.stat.exists }}"
  when: tuned_service_status.changed

- block:

    - name: create the tuned conf directory
      file:
        path: "/etc/tuned/no-thp"
        state: directory
        owner: root
        mode: "{{ vitam_defaults.folder.folder_permission }}"

    - name: add the tuned conf file
      copy:
        src: tuned.conf
        dest: /etc/tuned/no-thp/tuned.conf
        owner: root
        mode: "{{ vitam_defaults.folder.conf_permission }}"

    - name: enable the new tuned profile
      command: tuned-adm profile no-thp

    - name: restart tuned
      service:
        name: tuned
        state: restarted

  when:
    - status_tuned_present
    - ansible_virtualization_type != "docker"

- block:

    - name: add systemd service unit to disable THP
      copy:
        src: disable_transparent_hugepages.service
        dest: /usr/lib/systemd/system/disable_transparent_hugepages.service
        owner: root
        mode: 0700
      when: ansible_os_family == "RedHat"

    # - name: add systemd service unit to disable THP
    #   copy:
    #     src: disable_transparent_hugepages.service
    #     dest: /lib/systemd/system/disable_transparent_hugepages.service
    #     owner: root
    #     mode: 0700
    #   when: ansible_os_family == "Debian"

    # - name: enable systemd service unit to disable THP
    #   service:
    #     name: disable_transparent_hugepages.service
    #     enabled: yes
    #     state: started

  when: ansible_virtualization_type != "docker"

croftophile commented 5 years ago

Noted, the fix should be included in the upcoming releases. Thank you for the feedback.

bxaxa commented 5 years ago

OK, thanks. We will wait for the fix.

croftophile commented 5 years ago

Could you send us the error reported by Ansible? On our side, we are looking for the "cleanest" way to check for transparent huge pages before disabling them where applicable.

bxaxa commented 5 years ago

As soon as I reinstall, I will post the error in this issue.

bxaxa commented 5 years ago

Here is the error at deployment:

TASK [mongo_common : enable systemd service unit to disable THP] *****************************************************************************************************************************************************
fatal: [localhost]: FAILED! => {"changed": false, "msg": "Unable to start service disable_transparent_hugepages.service: Job for disable_transparent_hugepages.service failed because the control process exited with error code.\nSee \"systemctl status disable_transparent_hugepages.service\" and \"journalctl -xe\" for details.\n"}

systemctl status disable_transparent_hugepages.service

● disable_transparent_hugepages.service - Disable transparent huge pages
   Loaded: loaded (/lib/systemd/system/disable_transparent_hugepages.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Tue 2019-07-30 14:38:52 UTC; 1min 43s ago
  Process: 905 ExecStart=/bin/sh -c echo never > /sys/kernel/mm/transparent_hugepage/enabled (code=exited, status=2)
 Main PID: 905 (code=exited, status=2)

Jul 30 14:38:52 vitam-test systemd[1]: Starting Disable transparent huge pages...
Jul 30 14:38:52 vitam-test sh[905]: /bin/sh: 1: cannot create /sys/kernel/mm/transparent_hugepage/enabled: Directory nonexistent
Jul 30 14:38:52 vitam-test systemd[1]: disable_transparent_hugepages.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Jul 30 14:38:52 vitam-test systemd[1]: Failed to start Disable transparent huge pages.
Jul 30 14:38:52 vitam-test systemd[1]: disable_transparent_hugepages.service: Unit entered failed state.
Jul 30 14:38:52 vitam-test systemd[1]: disable_transparent_hugepages.service: Failed with result 'exit-code'.

Test to check whether THP is present:

mkdir -p  /sys/kernel/mm/transparent_hugepage
mkdir: cannot create directory ‘/sys/kernel/mm/transparent_hugepage’: Operation not permitted
echo 0 > /sys/kernel/mm/transparent_hugepage/enabled
-bash: /sys/kernel/mm/transparent_hugepage/enabled: No such file or directory

In short, neither the directory /sys/kernel/mm/transparent_hugepage nor the file /sys/kernel/mm/transparent_hugepage/enabled exists.

And they cannot be created.
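The same check can be done without attempting any write. The sketch below reads the control file if it exists and prints the active mode, which the kernel marks with square brackets (for example `always madvise [never]`), or prints `absent` when the file is missing. The function name `thp_mode` and its optional path argument are hypothetical, introduced only for illustration.

```shell
#!/bin/sh
# Hypothetical read-only probe (the name thp_mode is illustrative).
# Prints the active THP mode ("always", "madvise" or "never"),
# or "absent" when the kernel does not expose the control file.
thp_mode() {
    file="${1:-/sys/kernel/mm/transparent_hugepage/enabled}"
    if [ ! -r "$file" ]; then
        echo absent
        return 0
    fi
    # Extract the bracketed word, which is the value currently in effect.
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p' "$file"
}

# Example call, no root privileges needed:
#   thp_mode
```

On the instance described in this issue, this would be expected to print `absent`, since the directory does not exist.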

croftophile commented 5 years ago

A possible workaround:

---
# Install mongodb-org (needed by vitam-mongo* packages)
# and then disable the mongod service as it may start listening on 27017 port which is needed by vitam-mongos
- name: Install mongodb-org package
  package:
    name: mongodb-org
    state: latest
  register: result
  retries: "{{ packages_install_retries_number }}"
  until: result is succeeded
  delay: "{{ packages_install_retries_delay }}"

- name: Disable mongodb default service
  service:
    name: mongod
    state: stopped
    enabled: no

### System tuning best practices ####

# next steps in order to disable Transparent HugePages
# cf https://docs.mongodb.com/manual/tutorial/transparent-huge-pages/

- name: Check if tuned is installed
  stat:
    path: /usr/lib/systemd/system/tuned.service
  register: tuned_service_status
  when: ansible_os_family == "RedHat"

- name: Check if tuned is installed
  stat:
    path: /lib/systemd/system/tuned.service
  register: tuned_service_status
  when: ansible_os_family == "Debian"

- set_fact: status_tuned_present="{{ tuned_service_status.stat.exists }}"
  when: tuned_service_status.changed

- block:

    - name: create the tuned conf directory
      file:
        path: "/etc/tuned/no-thp"
        state: directory
        owner: root
        mode: "{{ vitam_defaults.folder.folder_permission }}"

    - name: add the tuned conf file
      copy:
        src: tuned.conf
        dest: /etc/tuned/no-thp/tuned.conf
        owner: root
        mode: "{{ vitam_defaults.folder.conf_permission }}"

    - name: enable the new tuned profile
      command: tuned-adm profile no-thp

    - name: restart tuned
      service:
        name: tuned
        state: restarted

  when:
    - status_tuned_present
    - ansible_virtualization_type != "docker"

- name: check /sys/kernel/mm/transparent_hugepage existence
  stat:
    path: /sys/kernel/mm/transparent_hugepage
  register: transparent_hugepage_dir

- name: for debug
  debug:
    msg: "{{ transparent_hugepage_dir }}"

- block:

    - name: add systemd service unit to disable THP
      copy:
        src: disable_transparent_hugepages.service
        dest: /usr/lib/systemd/system/disable_transparent_hugepages.service
        owner: root
        mode: 0700
      when: ansible_os_family == "RedHat"

    - name: add systemd service unit to disable THP
      copy:
        src: disable_transparent_hugepages.service
        dest: /lib/systemd/system/disable_transparent_hugepages.service
        owner: root
        mode: 0700
      when: ansible_os_family == "Debian"

    - name: enable systemd service unit to disable THP
      service:
        name: disable_transparent_hugepages.service
        enabled: yes
        state: started

  when: ( ansible_virtualization_type != "docker") and ( transparent_hugepage_dir.stat.isdir is defined and transparent_hugepage_dir.stat.isdir )

bxaxa commented 5 years ago

I will test it as soon as I do a reinstall (but I see no reason why it would not work).

amapi commented 5 years ago

Tested this morning, it works perfectly. The issue can be closed once the fix is included in the official sources.