quantum / esos

An open source, high performance, block-level storage platform.
http://www.esos-project.com/

aacraid: Outstanding Commands on (0,0,1,0) #256

Closed githublti closed 4 years ago

githublti commented 4 years ago

I'm currently building a small SAN that supports 8 drives. The RAID card that I picked up off of eBay is an Adaptec ASR 71605.

I've updated to the latest firmware posted on the Adaptec website.

Initially using ESOS 1.3.10, everything worked well, but I decided I should probably move to the 2.x series.

When booting a fresh build, I get the errors in the title, and the boot from USB gets stuck at the aacraid errors, with the numbers in the parentheses incrementing.

Searching Google, I found reports that this is an issue with Ubuntu 18, along with suggested kernel patches.

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1777586

Is this something that can be incorporated into ESOS?

Please let me know if there is more information that I can provide.

Thanks!

Doug Lytle

msmith626 commented 4 years ago

Do you know if that patch was incorporated into Linux 4.14.x? If so, which version? If you can confirm this patch made it to 4.14.x I can update that. Another option is to try the latest build from the ESOS 'master' branch which uses a 5.4.x kernel.
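For anyone checking which kernel series a given build is running before comparing it against a patch, here is a minimal sketch. It only parses the release string reported by the running system; the helper names `parse_kernel_release` and `in_series` are made up for illustration, and it does not tell you whether a particular patch was back-ported.

```python
import platform
import re


def parse_kernel_release(release: str) -> tuple:
    """Return (major, minor, patch) from a release string such as '4.14.120-foo'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor, patch = match.groups()
    # A missing patch component is treated as 0.
    return int(major), int(minor), int(patch or 0)


def in_series(release: str, major: int, minor: int) -> bool:
    """True if the release belongs to the given major.minor series."""
    return parse_kernel_release(release)[:2] == (major, minor)


if __name__ == "__main__":
    running = platform.release()  # same string that `uname -r` prints on Linux
    try:
        print(running, "is 4.14.x:", in_series(running, 4, 14))
        print(running, "is 5.4.x:", in_series(running, 5, 4))
    except ValueError as exc:
        print(exc)
```

Run on the booted system, this prints whether the kernel is in the 4.14.x or 5.4.x series discussed here.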

--Marc

githublti commented 4 years ago

That I do not know; I will do some more searching to see what I can find. I'll also build master to test with.

Doug

githublti commented 4 years ago

Marc,

I did not find any reference to the patches in the 4.14.x series, but when testing Master today, I had no issues with the Adaptec ASR 71605; everything worked as intended.

Doug

msmith626 commented 4 years ago

Okay, thanks for checking that out, Doug. The 5.4.x kernel likely uses a newer version of that driver from upstream; sometimes patches don't get back-ported (even for bugs).

In our case, the 'master' branch of ESOS will soon become the new stable/release branch (e.g., 3.x.x). There isn't a huge difference between 'master' and '2.x.x' right now, mainly just the kernel version (both are LTS kernels).

--Marc

githublti commented 4 years ago

Marc,

I've moved back to 1.3.10 on this setup; I'm unable to get beyond the below errors.

qla2x00t(12): RSCN registration failed: 0x2 (OK for non-fabric setups)
qla2x00t(12): CTIO with error status 0x10 received (state 3, scst_cmd)

My test server is a PowerEdge R610 with an up-to-date ESXi 6.5 Free and a QLE2562.

From what I've read, the 0x20 error indicates a bus error. I'm using a Ryzen Gigabyte motherboard with the QLE2562 in a 4x slot. This works fine on 1.3.10 without errors. I may swap the Adaptec card from the 16x slot to the 4x slot and test Master again.
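When looking up what a status means, it helps to pull the exact code out of the log first (the lines pasted above report 0x10). A rough sketch, assuming the message format matches those two lines; the function name `ctio_statuses` is made up for illustration, and the regular expression is only derived from the sample, not from the driver source:

```python
import re

# Pattern based only on the log line pasted in this thread.
CTIO_RE = re.compile(
    r"qla2x00t\((?P<host>\d+)\): CTIO with error status "
    r"(?P<status>0x[0-9a-fA-F]+) received"
)


def ctio_statuses(log_text):
    """Return a list of (host_number, status_code) pairs found in the text."""
    return [
        (int(m["host"]), int(m["status"], 16))
        for m in CTIO_RE.finditer(log_text)
    ]


sample = (
    "qla2x00t(12): RSCN registration failed: 0x2 (OK for non-fabric setups)\n"
    "qla2x00t(12): CTIO with error status 0x10 received (state 3, scst_cmd)\n"
)

for host, status in ctio_statuses(sample):
    print(f"host {host}: CTIO status {status:#x}")
```

The same function can be fed the contents of a saved kernel log to list every CTIO status that was reported.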

I'm assuming that if I ever decide to move to 2.x or 3.x, I will need to do a better job of reviewing hardware before upgrading.

Doug

msmith626 commented 4 years ago

Yeah, that or we track down the issues and resolve them in current kernels. Even in 'master' we are quite a few kernel patch releases behind for 5.4.x. Planning to bump that to the latest patch release soon.

--Marc


githublti commented 4 years ago

Marc,

I've installed Master build 2020-06-12T00:08:45.000Z and things seem to be working well so far.

Doug