[GRUB] error: timeout: could not resolve hardware address

RicardoJeronimo commented 5 years ago

Issue details:

Issue type: Question
Impact: Critical
How often it happens: Always
Brief description of the issue: Hello!

After installing DRLM 2.3.0, I've encountered some issues. Firstly, in order for the client to obtain the configuration files, I must not configure HTTP on the server. If I do configure it, following the latest documentation, there is no /usr/share/drlm/conf/HTTP/https.conf available in the branch and, because of that, I can't start httpd. If I download that file manually and place it in the correct directory, somehow, the certificate becomes invalid and I'm back at square one. The client can, however, download the configuration files if I skip the HTTP configuration. Why does this happen?

Now for the reason I opened this issue: After skipping the HTTP configuration, and executing runbackup successfully, I tried to test the recovery. The client obtains the IP from the DRLM DHCP but then hangs at "Welcome to GRUB!" with error: timeout: could not resolve hardware address. Here is a screenshot of said error: screenshot_1

How can I solve this problem?

Thank you!

krbu commented 5 years ago

Hi Ricardo,

You're right: version 2.3.0 is still in development and no longer uses Apache. We have a new service listening on port 443, and once we finish all the testing we'll update the documentation. The second issue is caused by the VMXNET3 driver. Try creating a new network interface with E1000; it should work.

RicardoJeronimo commented 5 years ago

Hi, @krbu!

Thank you, that did the trick. I was able to boot into DRLM Recovery and run rear recover. I'm a bit lost, however, because the Client Recover documentation is for version 1.17.2. Now I'm asked to start the restore process on my backup host. How can I do this?

krbu commented 5 years ago

Hi Ricardo, the restore has not changed; following the documentation for version 1.17.2 should work.

RicardoJeronimo commented 5 years ago

Hi, @krbu.

Here is a screenshot of what I was talking about:

As you can see, it happens after I run rear recover and I'm not quite sure how to proceed. In the latest documentation, that shell doesn't appear anywhere.

krbu commented 5 years ago

Hi Ricardo, it looks like the IP address is not set correctly. Can you show me the output of the following commands?

$ curl -k https://
$ ip a

RicardoJeronimo commented 5 years ago

Here are the outputs:

$ curl -k https://10.192.5.60
DRLM SERVER

$ ip a

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens224: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:50:56:a7:9d:79 brd ff:ff:ff:ff:ff:ff
    inet 10.192.5.59/24 brd 10.192.5.255 scope global ens224
       valid_lft forever preferred_lft forever
    inet6 fe80::250:56ff:fea7:9d79/64 scope link
       valid_lft forever preferred_lft forever

krbu commented 5 years ago

Hi,

try: rear recover SERVER=10.192.5.60 REST_OPTS=-k ID=centos2

RicardoJeronimo commented 5 years ago

Hey,

It returned the exact same output and opened the same shell.

krbu commented 5 years ago

Hi, can I see the result of this command:

$ curl -k https://10.192.5.60/clients/centos2

and, from the DRLM server:

$ df -h
$ drlm listbackup

Is the nfs-server service up on the DRLM server?

RicardoJeronimo commented 5 years ago

Hello,

Here are the outputs (sorry for the image but my centos2 machine doesn't boot anymore):

$ curl -k https://10.192.5.60/clients/centos2

$ df -h

Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/centos-root   50G  7.0G   43G  15% /
devtmpfs                 908M     0  908M   0% /dev
tmpfs                    920M     0  920M   0% /dev/shm
tmpfs                    920M   17M  903M   2% /run
tmpfs                    920M     0  920M   0% /sys/fs/cgroup
/dev/mapper/centos-home   24G   33M   24G   1% /home
/dev/sda1               1014M  184M  831M  19% /boot
/dev/loop101             5.4G  1.1G  4.3G  20% /var/lib/drlm/store/centos2
/dev/loop104             5.5G  997M  4.5G  18% /var/lib/drlm/store/suse1
tmpfs                    184M     0  184M   0% /run/user/0

$ drlm listbackup

Backup Id            Client Name     Backup Date        Backup Status   Duration        Backup Size
101.20181123065244   centos2         2018-11-23 06:52   enabled         0h.3m.35s       1.2G         
104.20181126154542   suse1           2018-11-26 15:45   enabled         0h.4m.10s       1.1G

And yes, the nfs-server service is active.

proura commented 5 years ago

Hello @RicardoJeronimo,

Can you try editing the file /etc/rear/local.conf and changing the line DRLM_REST_OPTS="--capath /etc/rear/cert" to DRLM_REST_OPTS="-k" before running the command "rear recover" while you are in rescue mode?
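For reference, the edit described above could be scripted roughly as follows. This is only a sketch: the helper name set_rest_opts_insecure is hypothetical, and it assumes sed with -i support is available in the rescue shell. Note that -k tells curl to skip certificate verification, so this is a workaround rather than a fix.

```shell
#!/bin/sh
# Hypothetical helper: switch DRLM_REST_OPTS to "-k" in a ReaR config file.
# The default path is the one named in this thread; pass another path to override it.
set_rest_opts_insecure() {
    conf="${1:-/etc/rear/local.conf}"
    # Keep a copy of the original file so the change can be reverted.
    cp -p "$conf" "$conf.orig" || return 1
    # Rewrite only an uncommented line that starts with DRLM_REST_OPTS=.
    sed -i 's|^DRLM_REST_OPTS=.*|DRLM_REST_OPTS="-k"|' "$conf" || return 1
    # Print the resulting line; fails if the setting is not present at all.
    grep '^DRLM_REST_OPTS=' "$conf"
}

# Usage in rescue mode (before starting the recovery):
#   set_rest_opts_insecure && rear recover
```

The original file is kept as /etc/rear/local.conf.orig, so the certificate check can be restored by copying it back.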

RicardoJeronimo commented 5 years ago

Hi, @proura,

It works! The machine was recovered successfully, thank you!

If I may, I have some questions:

Why did this happen? Is it a bug of some sort?
I noticed that the recovery also restored some files I had in the root directory. Does this mean I don't need to integrate Bacula or Bareos with DRLM to have file backup?
I'm also having some problems with a restore on an OpenSUSE Leap 15 machine. The process throws the error "The disk layout recreation script failed" and, more specifically, the log file contains these errors:
```
rpc.idmapd: libnfsidmap: requested translation method, 'nsswitch', is not available
rpc.idmapd unable to create name to user id mappings
Starting rpc.idmapd failed.
mount: /proc/fs/nfsd: unknown filesystem type 'nfsd'
```
What could be the cause of this?

Thank you again!

krbu commented 5 years ago

Hi Ricardo,

About the first question, it looks like there is some strange behaviour with the certificates on CentOS 7; we are testing and working on it.
Regarding the recovered files in /root: yes, ReaR can recover the whole machine; you specify what to include and exclude. The more you copy, the longer the restore takes. The idea is to recover the OS quickly and then recover databases, apps, or big files, but it's up to you. You can also back up a database using pre and post scripts that stop and start the DB, so the copy of the database is consistent. Take a look at /etc/drlm/clients/centos2.cfg (it holds the configuration of what you're going to copy). To learn more about ReaR, visit http://relax-and-recover.org/documentation/
For the last question, please open a new issue; OpenSUSE Leap 15 is still in testing.
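
As a rough illustration of the include/exclude and pre/post script idea, a client configuration might contain lines like the ones below. This is a sketch only: the variable names are taken from ReaR's configuration and should be checked against the default.conf of your ReaR version, whether DRLM's per-client file honors each of them is an assumption, and the excluded path and the mariadb service name are hypothetical.

```shell
# Sketch of ReaR-style settings for a client config such as /etc/drlm/clients/centos2.cfg.
# Assumption: these variables are read the same way as in ReaR's own local.conf.

# Leave bulky data out of the recovery backup (hypothetical path).
BACKUP_PROG_EXCLUDE=( "${BACKUP_PROG_EXCLUDE[@]}" '/srv/bigfiles/*' )

# Stop the database before the backup and start it again afterwards,
# so the copied database files are consistent (hypothetical service name).
PRE_BACKUP_SCRIPT="systemctl stop mariadb"
POST_BACKUP_SCRIPT="systemctl start mariadb"
```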

RicardoJeronimo commented 5 years ago

Hello, @krbu,

I'll open a new issue for the SUSE machine, then. Thank you for that insight, it was really helpful!

brainupdaters / drlm

[GRUB] error: timeout: could not resolve hardware address #92
