opnsense / tools

OPNsense release engineering toolkit
https://opnsense.org/
BSD 2-Clause "Simplified" License

[ARM] Port OPNsense to the Netgate SG3100 #162

Closed by rene-bayer 4 years ago

rene-bayer commented 4 years ago

Hello all,

I want to get OPNsense running on my Netgate SG3100 appliance. (I learned a lot during this project ^^)

After hours of debugging, experimenting and compiling, I got an image that boots with all the Marvell drivers loaded <3

But now I'm facing the problem that OPNsense becomes unresponsive after a successful boot :-( [image]

Any ideas?

All the best, René

rene-bayer commented 4 years ago

When I add a simple "/bin/csh" here, it drops me into a shell, and everything seems to work so far.

fichtner commented 4 years ago

very strange spot to "hang".. did you configure auto-login on the console?

rene-bayer commented 4 years ago

Next strange thing: this only occurs on the serial console.

Over SSH everything works as expected, so it seems to be a very device-specific problem.

fichtner commented 4 years ago

It's likely /etc/ttys or /etc/login.conf related.

nekoprog commented 4 years ago

@DarkSunOne Can your serial console detect input? I tried with a USB-serial adapter, but cannot send input to the console.

rene-bayer commented 4 years ago

@nekoprog Yes, as I said: when I add a "/bin/csh" here, everything works, and I can configure everything.

@fichtner Good hint with ttys. I get the following errors when I reboot the box: [image]

rene-bayer commented 4 years ago

So, is there anyone who is able to help me create a flashable image?

Currently I dd the ARM image to a USB stick and boot directly from that stick (u-boot>> usb start ; boot).

I received a recovery image from Netgate, but I have no idea how to build an OPNsense image like this :-(

It is an .img file that contains a 0.fat file and a 1.img file. 0.fat seems to be the first-stage bootloader, and 1.img seems to be the root system.

I can also provide this image to one of you if someone is interested or can help :-)

Edit: this is what the recovery image looks like: [image]
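For the "dd the image to a USB stick" step, a minimal sketch of writing an image and reading it back to verify it could look like this. The function name `write_and_verify` is made up for illustration, and the target path is whatever device node your stick shows up as (double-check it, since dd overwrites the target without asking).

```shell
# Hypothetical helper: write IMG to TARGET (a device node or a plain file),
# then read the same number of bytes back and compare them with the source.
write_and_verify() {
    img=$1
    target=$2
    size=$(($(wc -c < "$img"))) || return 1      # image size in bytes
    dd if="$img" of="$target" bs=1048576 || return 1
    sync                                         # flush buffered writes
    head -c "$size" "$target" | cmp -s - "$img"  # exit status 0 means identical
}
```

Usage would be something like `write_and_verify opnsense-arm.img /dev/da0 && echo OK`, where both names are placeholders.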

fichtner commented 4 years ago

I can't speak for the image, but at least there's one of your issues: there's no ttyvX devices, maybe it's all USB-based, list your /dev/tty* to see what's there. :)
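A rough way to check for this mismatch is to compare the terminal names in /etc/ttys with the nodes that actually exist under /dev. The following is only a sketch: `missing_tty_nodes` is a hypothetical helper, and it assumes the usual ttys(5) layout where the first field of each non-comment line is the device name.

```shell
# Hypothetical helper: print every terminal named in a ttys(5)-style file
# that has no matching node in the device directory.
missing_tty_nodes() {
    ttys=${1:-/etc/ttys}
    devdir=${2:-/dev}
    awk '{ sub(/#.*/, "") } NF >= 2 { print $1 }' "$ttys" |
    while read -r name; do
        [ -e "$devdir/$name" ] || echo "$name"
    done
}
```

Run without arguments on the box, it checks /etc/ttys against /dev; any name it prints is configured but has no device behind it.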

nekoprog commented 4 years ago

> So, is there anyone who is able to help me create a flashable image?

> Currently I dd the ARM image to a USB stick and boot directly from that stick (u-boot>> usb start ; boot).

> I received a recovery image from Netgate, but I have no idea how to build an OPNsense image like this :-(

> It is an .img file that contains a 0.fat file and a 1.img file. 0.fat seems to be the first-stage bootloader, and 1.img seems to be the root system.

> I can also provide this image to one of you if someone is interested or can help :-)

> Edit: this is what the recovery image looks like: [image]

If you can create a directory tree for the items listed in 0.fat and 1.img, I can point you somewhere. An OPNsense image is structured the same way when extracted from the .img: 0.fat is loaded to mmc/fat/MSDOSBOOT and 1.img is loaded to mmc/ufs/${ARMLABEL} on your flashed microSD card.

0.fat is where arm.sh copies /boot or ${STAGEDIR}/boot from OPNsense to fatmmc/boot/msdos or ${STAGEDIR}/boot/msdos. Mind you that these have to be mounted at that point in arm.sh; this is where the process of creating 0.fat starts.

1.img is ${STAGEDIR} converted into ufsmmc/, which is basically the OS itself.

If you reverse these steps, you can get the ${STAGEDIR} structure of the Netgate recovery image.
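Before mounting anything, it can help to see where the two pieces sit inside the .img. The sketch below assumes the image carries a classic MBR partition table (the `s2a` device name later in this thread suggests MBR slices, but that is an assumption about the recovery image) and 512-byte sectors. `mbr_partitions` is a made-up name; it only reads the four primary entries at byte offset 446.

```shell
# Hypothetical helper: list the primary MBR partition entries of an image.
# Each 16-byte entry holds the type at byte 4 and little-endian start/size
# (in sectors) at bytes 8-11 and 12-15.
mbr_partitions() {
    od -An -v -tu1 -j446 -N64 "$1" | awk '
        { for (i = 1; i <= NF; i++) byte[n++] = $i }   # collect the 64 table bytes
        END {
            for (p = 0; p < 4; p++) {
                o = p * 16
                type = byte[o + 4] + 0
                if (type == 0) continue                # unused entry
                start = byte[o + 8] + 256 * (byte[o + 9] + 256 * (byte[o + 10] + 256 * byte[o + 11]))
                count = byte[o + 12] + 256 * (byte[o + 13] + 256 * (byte[o + 14] + 256 * byte[o + 15]))
                printf "s%d type=0x%02x start=%d sectors=%d\n", p + 1, type, start, count
            }
        }'
}
```

In an MBR table, type 0x0c or 0x0b usually marks a FAT32 partition and 0xa5 a FreeBSD slice, which would correspond to 0.fat and 1.img respectively if the assumption holds.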

rene-bayer commented 4 years ago

Oh, okay ... well, I haven't looked that closely at the OPNsense image yet ^^

Thanks for the explanation. So basically I just need to mount 1.img from the recovery image, find the recovery image which starts on boot, migrate that to the OPNsense source and build an image with that 🤔

rene-bayer commented 4 years ago

Okay, I just made a successful flash: I replaced the built-in pfSense image with my OPNsense image. (Far from ready for production use.)

It really boots up fine, but I receive a lot of UFS and VFS errors:

UFS /dev/diskid/DISK-0A4A45A0s2a (/) cylinder checksum failed: cg 2, cgp: 0x5d474673 != bp: 0x2c1c94f6

and

mmcsd0: Error indicated: 2 Bad CRC
g_vfs_done():diskid/DISK-0A4A45A0s2a[WRITE(offset=1047527424, length=32768)]error = 5

And the OPNsense shell is still not loaded (so no input is possible).

Feels so good to have the goal so close <3 [image]
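To see which part of the disk the failing writes hit, the g_vfs_done() lines can be turned into sector numbers. This is a small sketch, assuming 512-byte sectors; `vfs_errors` is a hypothetical helper name. Error 5 is EIO, a generic I/O error.

```shell
# Hypothetical helper: summarise g_vfs_done() lines from a log (files or stdin)
# as device, operation, first sector, sector count and errno.
vfs_errors() {
    sed -n 's/.*g_vfs_done():\([^[]*\)\[\([A-Z]*\)(offset=\([0-9]*\), length=\([0-9]*\))] *error = \([0-9]*\).*/\1 \2 \3 \4 \5/p' "$@" |
    while read -r dev op off len err; do
        echo "$dev $op sector=$((off / 512)) count=$((len / 512)) errno=$err"
    done
}
```

For the line quoted above, offset 1047527424 works out to sector 2045952 (exactly 999 MiB into the slice) and length 32768 to 64 sectors.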

rene-bayer commented 4 years ago

@fichtner

> I can't speak for the image, but at least there's one of your issues: there's no ttyvX devices, maybe it's all USB-based, list your /dev/tty* to see what's there. :)

Here we go (the image is now booted directly from mmc, not USB): [image]

rene-bayer commented 4 years ago

Next hint :) As mentioned before, I added /bin/csh at line 2 for debugging, so I can use the serial shell.

When I start opnsense-shell again (on the CLI), everything works; the problem only occurs with the "first" shell, which starts directly after boot. [image]

fichtner commented 4 years ago

ttyu0 and ttyu1 are there... can you check that /etc/ttys on the original image matches ours? There's no "login:" prompt, so it never finds the primary serial (likely ttyu0), maybe also due to some quirk device hint that needs to be set on the hardware. I really don't know. For details see e.g. https://forum.opnsense.org/index.php?topic=6998.msg31097#msg31097
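One way to do that comparison is to reduce each /etc/ttys to the terminals that would actually get a getty and diff the two lists. The sketch below is an assumption-laden helper (`enabled_ttys` is a made-up name) that treats the status keywords `on`, `onifconsole` and `onifexists` from ttys(5) as "enabled".

```shell
# Hypothetical helper: print "name status" for every terminal in a
# ttys(5)-style file whose status would let init start a getty on it.
enabled_ttys() {
    awk '
        { sub(/#.*/, "") }      # drop comments
        NF < 2 { next }         # skip blank or incomplete lines
        {
            for (i = 2; i <= NF; i++)
                if ($i == "on" || $i == "onifconsole" || $i == "onifexists") {
                    print $1, $i
                    break
                }
        }
    ' "${1:-/etc/ttys}"
}
```

Running it once against the ttys file from the mounted recovery image and once against the OPNsense one, then comparing the output, should show whether ttyu0 is handled differently.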

fichtner commented 4 years ago

Second thought.. have you tried to set the thing to serial on the GUI?

fichtner commented 4 years ago

Reference: https://github.com/opnsense/tools/blob/master/config/20.1/extras.conf#L29-L37

The arm image was HDMI, so it has no serial console yet; nano has a dual console; serial has a serial console likewise.

nekoprog commented 4 years ago

> Next hint :) As mentioned before, I added /bin/csh at line 2 for debugging, so I can use the serial shell.

> When I start opnsense-shell again (on the CLI), everything works; the problem only occurs with the "first" shell, which starts directly after boot. [image]

You might want to start debugging from here and look at what really happens after system_console_unmute().

Or maybe here, since the OPNsense banner is already shown. I hope you can find something there.

nekoprog commented 4 years ago

> Reference: https://github.com/opnsense/tools/blob/master/config/20.1/extras.conf#L29-L37

> arm was HDMI so not serial yet, nano has dual console, serial has serial console likewise

Can we merge the arm and nano hooks?

rene-bayer commented 4 years ago

> Second thought.. have you tried to set the thing to serial on the GUI?

Yes, I already tried all the settings :-(

I will read the links later.

... Okay, I fixed it, but I don't like it :-D

When I remove "banner" here, everything works as expected, or at least it seems so. Why is the banner called here, and not the "normal whole" opnsense-shell?

Edit: okay, I see: the "program" runs in a while loop, so nothing is executed after it.

fichtner commented 4 years ago

Well, use “set -x” in /usr/local/etc/rc to confirm. Banner is a PHP script but echo-debug should lead you to the blocking call.

fichtner commented 4 years ago

Oh sorry, use set -x in opnsense-shell then. It was changed to launch the banner I see.
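For anyone unfamiliar with the technique: `set -x` makes the shell print each command to stderr before running it, so the last traced line before a hang is the call that blocks. A tiny stand-alone demonstration follows; the function name and the commands inside it are placeholders, not taken from opnsense-shell.

```shell
# Illustration only: trace a short command sequence in a subshell so the
# tracing does not leak into the calling shell.
trace_demo() {
    (
        set -x              # from here on, each command is printed before it runs
        echo "before"
        true                # stand-in for a call that might block
        echo "after"
    )
}
```

Calling `trace_demo` prints lines such as `+ true` on stderr next to the normal output. In a real script the same effect comes from adding `set -x` near the top or from running it as `sh -x script`.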

rene-bayer commented 4 years ago

I'm still trying to understand all the rc stuff from FreeBSD ...

When /usr/local/etc/rc is executed on boot, /usr/local/sbin/opnsense-shell banner is executed, and this script just ends with exit code 0. So everything is fine at this point, but there's nothing that gets triggered after that. I think we have to search in this direction...

Why is nothing executed after /usr/local/etc/rc finishes successfully?

Shouldn't there be a login prompt after that?

fichtner commented 4 years ago

Well, as I said init is done and rc exited zero. There’s no issue except no login prompt appears which is related to tty not spawning due to /etc/ttys misalignment. Try to set serial as primary console and see if that is enough.... 😉

rene-bayer commented 4 years ago

Slowly this all makes sense to me :-)

Tried with the following settings: [image] ... still no luck :-(

nekoprog commented 4 years ago

> I'm still trying to understand all the rc stuff from FreeBSD ...

> When /usr/local/etc/rc is executed on boot, /usr/local/sbin/opnsense-shell banner is executed, and this script just ends with exit code 0. So everything is fine at this point, but there's nothing that gets triggered after that. I think we have to search in this direction...

> Why is nothing executed after /usr/local/etc/rc finishes successfully?

> Shouldn't there be a login prompt after that?

It's not about FreeBSD; it's an OPNsense feature. It should not give you a shell prompt after a successful login; instead you'll get a CLI dashboard to manage OPNsense from the console. There is a menu entry to enter a shell, which is number 8 if I'm not mistaken.

It's the same with pfSense.

For this purpose, I think you should start debugging from here to find out why the CLI menu is not shown after the banner. Make sure to comment out this line. You could try changing this to /usr/local/sbin/opnsense-shell and boot again to see if anything changes.

rene-bayer commented 4 years ago

@nekoprog Yes, if I change this line to /usr/local/sbin/opnsense-shell, everything works as expected. The interesting thing we need to troubleshoot is why the login prompt isn't showing up.

As @fichtner already said, this should be /dev/tty* related, but I have no idea how to identify the problem at the moment.

@fichtner Another question: I have to implement the whole switch configuration in the GUI. Do you have tips on where I should start? These are completely new interfaces which I (or we) have to bring up in the GUI. On the FreeBSD side everything is set up and working, so it's really only the OPNsense (+GUI) stuff. Should I create a separate issue for that?

fichtner commented 4 years ago

If you have the time to work on it, a plugins.git issue would be good. We’re trying to get the device code to be fully plugin-ready at the moment, so that is a good target to work on. Also, most won’t need the switch support, so it’s better left out of the core itself.

Take a look at the new VXLAN code in core.git master which is how you can get started.

Cheers, Franco

rene-bayer commented 4 years ago

First I want to fix the open issues I am facing at the moment.

First, and most annoying, is still that there's no login prompt.

I now know that the pfSense image is definitely using /dev/ttyu0, and /dev/ttyu0 is also there in OPNsense, so where to start troubleshooting?

fichtner commented 4 years ago

@DarkSunOne it might be a good opportunity to ask the manufacturer what is needed to make that happen

rene-bayer commented 4 years ago

So I should ask Netgate why OPNsense doesn't show me a login prompt on their branded hardware? :-D

fichtner commented 4 years ago

Why not? Assuming FreeBSD has the same issue, they know how to fix it.