agroportal / project-management

Repository used to consolidate documentation about the AgroPortal project and track content related issues.
http://agroportal.lirmm.fr

Server down 15/12/2022 21H #335

Closed jonquet closed 4 months ago

jonquet commented 1 year ago

http://agroportal.lirmm.fr/ontologies/EBO

jonquet commented 1 year ago

Same issue with DMO Core

syphax-bouazzouni commented 1 year ago

It is working now without my changing anything (apart from @jonquet restarting the server).

This is most likely because the server was down at the time you wrote this issue. More investigation is needed, starting with the logs for 15/12/2022 between 20h and 22h.

syphax-bouazzouni commented 1 year ago

Investigation

Logs from /var/log/messages for Dec 15 2022, from 21:06:53 to 21:07:02:

Dec 15 21:06:53 agroportal kernel: pool invoked oom-killer: gfp_mask=0x280da, order=0, oom_score_adj=0
Dec 15 21:06:53 agroportal kernel: pool cpuset=/ mems_allowed=0
Dec 15 21:06:53 agroportal kernel: CPU: 4 PID: 10149 Comm: pool Not tainted 3.10.0-1160.53.1.el7.x86_64 #1
Dec 15 21:06:53 agroportal kernel: Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
Dec 15 21:06:53 agroportal kernel: Call Trace:
Dec 15 21:06:53 agroportal kernel: [<ffffffff81b83579>] dump_stack+0x19/0x1b
Dec 15 21:06:53 agroportal kernel: [<ffffffff81b7e618>] dump_header+0x90/0x229
Dec 15 21:06:53 agroportal kernel: [<ffffffff81506992>] ? ktime_get_ts64+0x52/0xf0
Dec 15 21:06:53 agroportal kernel: [<ffffffff8155e01f>] ? delayacct_end+0x8f/0xb0
Dec 15 21:06:53 agroportal kernel: [<ffffffff815c254d>] oom_kill_process+0x2cd/0x490
Dec 15 21:06:53 agroportal kernel: [<ffffffff815c1f3d>] ? oom_unkillable_task+0xcd/0x120
Dec 15 21:06:53 agroportal kernel: [<ffffffff815c2c3a>] out_of_memory+0x31a/0x500
Dec 15 21:06:53 agroportal kernel: [<ffffffff815c9854>] __alloc_pages_nodemask+0xad4/0xbe0
Dec 15 21:06:53 agroportal kernel: [<ffffffff8161cc49>] alloc_pages_vma+0xa9/0x200
Dec 15 21:06:53 agroportal kernel: [<ffffffff815f6837>] handle_mm_fault+0xcb7/0xfb0
Dec 15 21:06:53 agroportal kernel: [<ffffffff81b90653>] __do_page_fault+0x213/0x500
Dec 15 21:06:53 agroportal kernel: [<ffffffff81b90a26>] trace_do_page_fault+0x56/0x150
Dec 15 21:06:53 agroportal kernel: [<ffffffff81b8ffa2>] do_async_page_fault+0x22/0xf0
Dec 15 21:06:53 agroportal kernel: [<ffffffff81b8c7a8>] async_page_fault+0x28/0x30
Dec 15 21:06:53 agroportal kernel: Mem-Info:
Dec 15 21:06:53 agroportal kernel: active_anon:18307712 inactive_anon:817318 isolated_anon:0#012 active_file:3421 inactive_file:4896 isolated_file:32#012 unevictable:0 dirty:5 writeback:0 unstable:0#012 slab_reclaimable:28969 slab_unreclaimable:17666#012 mapped:4527 shmem:94269 pagetables:54012 bounce:0#012 free:94383 free_pcp:416 free_cma:0
Dec 15 21:06:53 agroportal kernel: Node 0 DMA free:15908kB min:12kB low:12kB high:16kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Dec 15 21:06:53 agroportal kernel: lowmem_reserve[]: 0 2826 76340 76340
Dec 15 21:06:53 agroportal kernel: Node 0 DMA32 free:297316kB min:2500kB low:3124kB high:3748kB active_anon:2048936kB inactive_anon:525316kB active_file:276kB inactive_file:4292kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3129304kB managed:2894776kB mlocked:0kB dirty:0kB writeback:0kB mapped:200kB shmem:5052kB slab_reclaimable:4492kB slab_unreclaimable:2976kB kernel_stack:256kB pagetables:9356kB unstable:0kB bounce:0kB free_pcp:832kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:455 all_unreclaimable? no
Dec 15 21:06:53 agroportal kernel: lowmem_reserve[]: 0 0 73513 73513
Dec 15 21:06:53 agroportal kernel: Node 0 Normal free:70000kB min:65068kB low:81332kB high:97600kB active_anon:71181912kB inactive_anon:2743956kB active_file:13408kB inactive_file:13968kB unevictable:0kB isolated(anon):0kB isolated(file):128kB present:76546048kB managed:75280884kB mlocked:0kB dirty:20kB writeback:0kB mapped:17908kB shmem:372024kB slab_reclaimable:111384kB slab_unreclaimable:67688kB kernel_stack:9584kB pagetables:206692kB unstable:0kB bounce:0kB free_pcp:848kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:96 all_unreclaimable? no
Dec 15 21:06:53 agroportal kernel: lowmem_reserve[]: 0 0 0 0
Dec 15 21:06:53 agroportal kernel: Node 0 DMA: 1*4kB (U) 0*8kB 0*16kB 1*32kB (U) 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15908kB
Dec 15 21:06:53 agroportal kernel: Node 0 DMA32: 138*4kB (UEM) 163*8kB (UEM) 253*16kB (UEM) 150*32kB (UEM) 103*64kB (UEM) 29*128kB (UEM) 11*256kB (UE) 193*512kB (UEM) 171*1024kB (UEM) 0*2048kB 0*4096kB = 297744kB
Dec 15 21:06:53 agroportal kernel: Node 0 Normal: 389*4kB (UEM) 758*8kB (UEM) 808*16kB (UEM) 474*32kB (UEM) 353*64kB (UEM) 95*128kB (UEM) 5*256kB (UEM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 71748kB
Dec 15 21:06:53 agroportal kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Dec 15 21:06:53 agroportal kernel: 258889 total pagecache pages
Dec 15 21:06:53 agroportal kernel: 156114 pages in swap cache
Dec 15 21:06:53 agroportal kernel: Swap cache stats: add 1397110, delete 1240986, find 236945947/237136453
Dec 15 21:06:53 agroportal kernel: Free swap  = 0kB
Dec 15 21:06:53 agroportal kernel: Total swap = 2097148kB
Dec 15 21:06:53 agroportal kernel: 19922836 pages RAM
Dec 15 21:06:53 agroportal kernel: 0 pages HighMem/MovableOnly
Dec 15 21:06:53 agroportal kernel: 374944 pages reserved
Dec 15 21:06:53 agroportal kernel: [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
Dec 15 21:06:53 agroportal kernel: [  539]     0   539    14162     4076      32       95             0 systemd-journal
Dec 15 21:06:53 agroportal kernel: [  570]     0   570    66027        0      30      116             0 lvmetad
Dec 15 21:06:53 agroportal kernel: [  576]     0   576    11455        2      22      224         -1000 systemd-udevd
Dec 15 21:06:53 agroportal kernel: [  708]     0   708    13883       43      27       80         -1000 auditd
Dec 15 21:06:53 agroportal kernel: [  729]     0   729     6596       93      19       36             0 systemd-logind
Dec 15 21:06:53 agroportal kernel: [  740]   999   740   153088      122      65     2309             0 polkitd
Dec 15 21:06:53 agroportal kernel: [  741]     0   741     5407       57      15       44             0 irqbalance
Dec 15 21:06:53 agroportal kernel: [  750]    81   750    14538      110      33      100          -900 dbus-daemon
Dec 15 21:06:53 agroportal kernel: [  754]   998   754    29452       66      29       85             0 chronyd
Dec 15 21:06:53 agroportal kernel: [  783]     0   783    31598       69      19      128             0 crond
Dec 15 21:06:53 agroportal kernel: [  784]     0   784    27552        1      10       33             0 agetty
Dec 15 21:06:53 agroportal kernel: [ 1011]     0  1011   143571      714      97     3196             0 tuned
Dec 15 21:06:53 agroportal kernel: [ 1014]     0  1014    28246        1      58      259         -1000 sshd
Dec 15 21:06:53 agroportal kernel: [ 1024]     0  1024    57309      887      67      526             0 snmpd
Dec 15 21:06:53 agroportal kernel: [ 1032]     0  1032   113490     2562      97      113             0 rsyslogd
Dec 15 21:06:53 agroportal kernel: [ 1187]    27  1187    28354        1      11       74             0 mysqld_safe
Dec 15 21:06:53 agroportal kernel: [ 1779]    27  1779   348607     7918      99     7634             0 mysqld
Dec 15 21:06:53 agroportal kernel: [ 1873]     0  1873    22447       34      44      241             0 master
Dec 15 21:06:53 agroportal kernel: [ 1902]    89  1902    22517       31      43      235             0 qmgr
Dec 15 21:06:53 agroportal kernel: [ 3029]   995  3029    27021       82      54      216             0 4s-boss
Dec 15 21:06:53 agroportal kernel: [ 3043]   995  3043    42727      105      49      163             0 4s-backend
Dec 15 21:06:53 agroportal kernel: [ 3078]   887  3078   334773   190858     566      511             0 redis-server
Dec 15 21:06:53 agroportal kernel: [ 3098]   887  3098   497589    85348     897       37             0 redis-server
Dec 15 21:06:53 agroportal kernel: [ 3114]   887  3114   619445   386542    1155   127444             0 redis-server
Dec 15 21:06:53 agroportal kernel: [ 3159]   994  3159   257142    38645     286    95557             0 mgrep
Dec 15 21:06:53 agroportal kernel: [ 3167]   994  3167    86130        1     175    82462             0 mgrep
Dec 15 21:06:53 agroportal kernel: [ 3174]   993  3174   167528   103561     262    15215             0 memcached
Dec 15 21:06:53 agroportal kernel: [ 3211]   888  3211    95526     1885     138    19321             0 bundle
Dec 15 21:06:53 agroportal kernel: [ 3325]   996  3325  2580605   180174    1809    41519             0 java
Dec 15 21:06:53 agroportal kernel: [ 3356]     0  3356     9826       23      21      217             0 nginx
Dec 15 21:06:53 agroportal kernel: [ 3357]   997  3357     9953      147      22      275             0 nginx
Dec 15 21:06:53 agroportal kernel: [ 3358]   997  3358     9953      147      22      276             0 nginx
Dec 15 21:06:53 agroportal kernel: [ 3359]   997  3359     9953      147      22      275             0 nginx
Dec 15 21:06:53 agroportal kernel: [ 3360]   997  3360     9953      146      22      276             0 nginx
Dec 15 21:06:53 agroportal kernel: [ 3361]   997  3361     9953      147      22      275             0 nginx
Dec 15 21:06:53 agroportal kernel: [ 3362]   997  3362     9953      112      22      269             0 nginx
Dec 15 21:06:53 agroportal kernel: [ 3363]   997  3363     9953      145      22      277             0 nginx
Dec 15 21:06:53 agroportal kernel: [ 3364]   997  3364     9953      104      22      277             0 nginx
Dec 15 21:06:53 agroportal kernel: [ 3365]   997  3365     9953      146      22      276             0 nginx
Dec 15 21:06:53 agroportal kernel: [ 3366]   997  3366     9953      146      22      276             0 nginx
Dec 15 21:06:53 agroportal kernel: [ 3367]   997  3367     9953      115      22      275             0 nginx
Dec 15 21:06:53 agroportal kernel: [ 3368]   997  3368     9953      105      22      276             0 nginx
Dec 15 21:06:53 agroportal kernel: [ 3370]   997  3370     9953      146      22      276             0 nginx
Dec 15 21:06:53 agroportal kernel: [ 3371]   997  3371     9953      146      22      276             0 nginx
Dec 15 21:06:53 agroportal kernel: [ 3372]   997  3372     9953      146      22      276             0 nginx
Dec 15 21:06:53 agroportal kernel: [ 3373]   997  3373     9953      122      22      276             0 nginx
Dec 15 21:06:53 agroportal kernel: [ 3380]     0  3380    91152     1068     146       56             0 httpd
Dec 15 21:06:53 agroportal kernel: [ 3614]   888  3614   509123    15256     171     3171             0 ruby
Dec 15 21:06:53 agroportal kernel: [ 5876]    53  5876  6795688  1421556    3131    47342             0 java
Dec 15 21:06:53 agroportal kernel: [ 4701]   995  4701    29389      264      56        0             0 4s-httpd
Dec 15 21:06:53 agroportal kernel: [19167]     0 19167    91056      588      51        0         -1000 PassengerAgent
Dec 15 21:06:53 agroportal kernel: [19170]     0 19170   899679     3411     136        0             0 PassengerAgent
Dec 15 21:06:53 agroportal kernel: [19211]    48 19211    91152     1011     142       55             0 httpd
Dec 15 21:06:53 agroportal kernel: [19246]    48 19246   556075     5489     196       54             0 httpd
Dec 15 21:06:53 agroportal kernel: [19247]    48 19247   556075     5045     196       54             0 httpd
Dec 15 21:06:53 agroportal kernel: [21478]   888 21478   159972    31216     172     1311             0 bundle
Dec 15 21:06:53 agroportal kernel: [22882]   888 22882   167380    40404     187     1112             0 bundle
Dec 15 21:06:53 agroportal kernel: [23857]   888 23857   160608    33138     173     1156             0 bundle
Dec 15 21:06:53 agroportal kernel: [23980]   888 23980   163159    33968     178     1222             0 bundle
Dec 15 21:06:53 agroportal kernel: [26454]   888 26454   163815    35634     180     1125             0 bundle
Dec 15 21:06:53 agroportal kernel: [26487]   888 26487   163602    32923     179     1275             0 bundle
Dec 15 21:06:53 agroportal kernel: [26614]   888 26614   188300    64551     230      668             0 bundle
Dec 15 21:06:53 agroportal kernel: [29402]   888 29402   177981    72410     250        0             0 ruby
Dec 15 21:06:53 agroportal kernel: [29587]   995 29587 17220417 15896914   31144        0             0 4s-httpd
Dec 15 21:06:53 agroportal kernel: [29588]   995 29588  1642268     4914    2076      138             0 4s-backend
Dec 15 21:06:53 agroportal kernel: [29600]   995 29600  1638412      960    2063      138             0 4s-backend
Dec 15 21:06:53 agroportal kernel: [29612]   995 29612  1638156      796    2079      138             0 4s-backend
Dec 15 21:06:53 agroportal kernel: [29619]   995 29619  1638182      849    2073      138             0 4s-backend
Dec 15 21:06:53 agroportal kernel: [ 3690]   888  3690   161023    31362     174     1315             0 bundle
Dec 15 21:06:53 agroportal kernel: [ 3693]   888  3693   184979    56714     220      960             0 bundle
Dec 15 21:06:53 agroportal kernel: [ 3736]   888  3736   156969    26694     165     1392             0 bundle
Dec 15 21:06:53 agroportal kernel: [ 4614]   888  4614   155435    25800     162     1371             0 bundle
Dec 15 21:06:53 agroportal kernel: [ 5011]   888  5011   159686    29458     170     1321             0 bundle
Dec 15 21:06:53 agroportal kernel: [ 7315]    89  7315    22473      317      44        0             0 pickup
Dec 15 21:06:53 agroportal kernel: [10146]   888 10146   507071    18349     172     1998             0 scheduler.rb:4*
Dec 15 21:06:53 agroportal kernel: [10273]   888 10273   117657    26467     167        0             0 ruby
Dec 15 21:06:53 agroportal kernel: [10439]   888 10439   152902    33482     183        0             0 ruby
Dec 15 21:06:53 agroportal kernel: [10469]   888 10469   151872    31462     177        0             0 ruby
Dec 15 21:06:53 agroportal kernel: [10489]   888 10489   151897    31386     177        0             0 ruby
Dec 15 21:06:53 agroportal kernel: [10509]   888 10509   152462    29300     177        0             0 ruby
Dec 15 21:06:53 agroportal kernel: [10530]   888 10530   134529    26416     162        0             0 ruby
Dec 15 21:06:53 agroportal kernel: Out of memory: Kill process 29587 (4s-httpd) score 793 or sacrifice child
Dec 15 21:06:53 agroportal kernel: Killed process 29587 (4s-httpd), UID 995, total-vm:68881668kB, anon-rss:63587444kB, file-rss:212kB, shmem-rss:0kB
Dec 15 21:07:02 agroportal 4store[4701]: httpd.c:1979 child 29587 terminated by signal 9
Dec 15 21:07:02 agroportal 4s-httpd: 4store[4701]: httpd.c:1979 child 29587 terminated by signal 9

syphax-bouazzouni commented 1 year ago

@alexskr maybe you can help me interpret this (thanks).

4store got killed (not a surprise), but in this case the cause may be on our side: the server had no memory left.

alexskr commented 1 year ago

That system was running low on memory, so the oom-killer (out-of-memory killer) terminated a process to free some up. In this case it terminated 4s-httpd, the front end of 4store, which processes SPARQL queries.
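To see which processes held the most memory when the oom-killer fired, the process table in the log above can be parsed and ranked by resident set size. The following is a minimal sketch, not an existing tool: the function names (`parse_oom_table`, `top_by_rss`) and the regular expression are assumptions made for illustration, and the sample rows are copied from the log in this issue. It assumes 4 kB pages, which is consistent with the `total-vm` value in the "Killed process" line (17220417 pages × 4 kB = 68881668 kB).

```python
import re

PAGE_KB = 4  # assumed page size in kB (x86_64 default; matches the log's total-vm)

# Matches process rows such as:
# "... kernel: [29587]   995 29587 17220417 15896914   31144   0   0 4s-httpd"
# The column header row "[ pid ] uid tgid ..." does not match because it has no digits.
ROW = re.compile(
    r"kernel: \[\s*(?P<pid>\d+)\]\s+(?P<uid>\d+)\s+(?P<tgid>\d+)\s+(?P<total_vm>\d+)"
    r"\s+(?P<rss>\d+)\s+(?P<nr_ptes>\d+)\s+(?P<swapents>\d+)\s+(?P<adj>-?\d+)\s+(?P<name>\S+)"
)


def parse_oom_table(lines):
    """Return (name, pid, rss_kb) for every process row found in the log lines."""
    rows = []
    for line in lines:
        m = ROW.search(line)
        if m:
            rows.append((m["name"], int(m["pid"]), int(m["rss"]) * PAGE_KB))
    return rows


def top_by_rss(lines, n=3):
    """Return the n processes with the largest resident set size."""
    return sorted(parse_oom_table(lines), key=lambda r: r[2], reverse=True)[:n]


# Sample rows copied from the log in this issue.
SAMPLE = [
    "Dec 15 21:06:53 agroportal kernel: [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name",
    "Dec 15 21:06:53 agroportal kernel: [ 3114]   887  3114   619445   386542    1155   127444             0 redis-server",
    "Dec 15 21:06:53 agroportal kernel: [ 3357]   997  3357     9953      147      22      275             0 nginx",
    "Dec 15 21:06:53 agroportal kernel: [ 5876]    53  5876  6795688  1421556    3131    47342             0 java",
    "Dec 15 21:06:53 agroportal kernel: [29587]   995 29587 17220417 15896914   31144        0             0 4s-httpd",
]

if __name__ == "__main__":
    for name, pid, rss_kb in top_by_rss(SAMPLE):
        print(f"{name:12s} pid={pid:<6d} rss={rss_kb / 1024**2:.1f} GiB")
```

For 4s-httpd this gives 15896914 pages × 4 kB = 63587656 kB, which equals the anon-rss plus file-rss (63587444 kB + 212 kB) reported in the "Killed process" line. To run it against the real file, pass the lines of /var/log/messages instead of `SAMPLE`.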

syphax-bouazzouni commented 1 year ago

But is it normal for a server with 8 CPUs and 32 GB of RAM to run that low on memory? If so, what system requirements would suit our current situation (200 ontologies and about a hundred connections per day) to avoid this?

alexskr commented 1 year ago

It can happen, especially when a large ontology is being parsed. The OWL API Java process can take up to 10 GB of RAM (or whatever LinkedData.java_max_heap_size is set to). Normally 4s-httpd doesn't take a lot of RAM, but once in a while it can get into a funky state where its memory utilization goes up and stays bloated until it is restarted.

BioPortal runs 4store on a dedicated VM with plenty of RAM (64 GB). 64 GB is overkill, but the extra RAM is very helpful for disk caching. We also run ncbo_cron on a dedicated VM so that it doesn't interfere with 4store and other processes.
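One way to catch the bloated 4s-httpd state before the oom-killer does would be to check the process's resident memory periodically. The sketch below is only an illustration under stated assumptions: the helper names (`rss_kb_from_status`, `is_bloated`) and the 8 GiB limit are hypothetical and not part of 4store or ncbo_cron. It relies on the `VmRSS` field that Linux exposes in `/proc/<pid>/status`.

```python
def rss_kb_from_status(status_text):
    """Extract VmRSS (in kB) from the text of /proc/<pid>/status, or None if absent."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:\t  123456 kB"; the second field is the value.
            return int(line.split()[1])
    return None


def is_bloated(status_text, limit_kb):
    """True when the process's resident memory exceeds limit_kb."""
    rss = rss_kb_from_status(status_text)
    return rss is not None and rss > limit_kb


def check_pid(pid, limit_kb):
    """Read /proc/<pid>/status (Linux only) and report whether the process is bloated."""
    with open(f"/proc/{pid}/status") as fh:
        return is_bloated(fh.read(), limit_kb)


# Hypothetical limit for illustration: 8 GiB expressed in kB.
LIMIT_KB = 8 * 1024 * 1024
```

A cron job or monitoring agent could call `check_pid` with the 4s-httpd PID and raise an alert, or schedule a restart, when it returns `True`. The right limit depends on the server and would need to be chosen from observed usage.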

syphax-bouazzouni commented 4 months ago

Closing as a duplicate of https://github.com/ontoportal-lirmm/bioportal_web_ui/issues/8