Xilinx / ml-suite

Getting Started with Xilinx ML Suite
https://aws.amazon.com/marketplace/pp/B077FM2JNS

loading AFI error #22

Closed b04902036 closed 6 years ago

b04902036 commented 6 years ago

I am using the Xilinx ML Suite AMI and trying to run the same demo app with the same command, "./run.sh aws e2e", as in this issue. I also ran source sdaccel_setup.sh after making the recommended modification. However, it gives me the following error:

[XBLAS] # kernels: 1
[0]user:0xf010:0x1d51:[xocl:2017.4.5:128]
xclProbe found 1 FPGA slots with xocl driver running
CL_PLATFORM_VENDOR Xilinx
CL_PLATFORM_NAME Xilinx
CL_DEVICE_0: 0x3169fe0
CL_DEVICES_FOUND 1, using 0
loading /home/centos/src/project_data/ml-suite/overlaybins/aws/overlay_3.xclbin
AFI not yet loaded, proceed to download.
ERROR: Failed to create compute program from binary -44

I also tried the Jupyter tutorial provided by ml-suite, and it gives me the same error:

loading /home/centos/src/project_data/ml-suite/overlaybins/aws/overlay_3.xclbin
AFI not yet loaded, proceed to download.
ERROR: Failed to create compute program from binary -44

Could someone help me out with this? Thank you sincerely! UPDATE: I am using the master branch.
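For reference, the -44 at the end of the error line appears to be an OpenCL status code returned when the program was created from the binary. In the standard Khronos CL/cl.h header, -44 is CL_INVALID_PROGRAM. A minimal Python sketch for decoding such codes follows; the table is only a small subset of the header, and the helper name opencl_error_name is hypothetical (it is not part of ml-suite or the Xilinx runtime).

```python
# Subset of OpenCL status codes as defined in the Khronos CL/cl.h header.
OPENCL_ERRORS = {
    0: "CL_SUCCESS",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
    -30: "CL_INVALID_VALUE",
    -33: "CL_INVALID_DEVICE",
    -42: "CL_INVALID_BINARY",
    -43: "CL_INVALID_BUILD_OPTIONS",
    -44: "CL_INVALID_PROGRAM",
    -45: "CL_INVALID_PROGRAM_EXECUTABLE",
}


def opencl_error_name(code):
    """Return the symbolic name for an OpenCL status code (hypothetical helper)."""
    return OPENCL_ERRORS.get(code, "unknown OpenCL status %d" % code)


print(opencl_error_name(-44))
```

The symbolic name alone does not say why the program is invalid; it only narrows the search to the binary being loaded rather than the device or driver.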

b04902036 commented 6 years ago

By the way, shouldn't we first convert the .xclbin file into an .awsxclbin file and create an AFI? I haven't seen this step in the provided tutorial. UPDATE: I tried to do this with $SDACCEL_DIR/tools/create_sdaccel_afi.sh provided by this, at tag v1.3.6, and it gives me this error log:

{
    "FpgaImages": [
        {
            "UpdateTime": "2018-07-25T08:17:36.000Z", 
            "Name": "../overlay_2.xclbin", 
            "Tags": [], 
            "PciId": {
                "SubsystemVendorId": "0xfedd", 
                "VendorId": "0x1d0f", 
                "DeviceId": "0xf000", 
                "SubsystemId": "0x1d51"
            }, 
            "FpgaImageGlobalId": "agfi-0cb29ac225e5599f8", 
            "Public": false, 
            "State": {
                "Message": "UNKNOWN_BITSTREAM_GENERATE_ERROR: An unexpected error occurred generating the bitstream", 
                "Code": "failed"
            }, 
            "ShellVersion": "0x071417d3", 
            "OwnerId": "532726133341", 
            "FpgaImageId": "afi-06f41586bdba62eb0", 
            "CreateTime": "2018-07-25T08:12:16.000Z", 
            "Description": "../overlay_2.xclbin"
        }
    ]
}

I guess this is the same problem as this; not sure if this helps...
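The JSON above has the shape of an FPGA image description (an "FpgaImages" list whose entries carry a "State" with "Code" and "Message"). As a rough sketch, assuming that shape, the state of each image can be pulled out with a few lines of Python. The function name summarize_fpga_images is hypothetical, and SAMPLE is a trimmed copy of the log above.

```python
import json


def summarize_fpga_images(text):
    """Return (FpgaImageId, state code, state message) for each image in the JSON."""
    doc = json.loads(text)
    rows = []
    for image in doc.get("FpgaImages", []):
        state = image.get("State", {})
        rows.append((
            image.get("FpgaImageId"),
            state.get("Code"),
            state.get("Message", ""),
        ))
    return rows


# Trimmed copy of the log above; only the fields used here are kept.
SAMPLE = """
{
    "FpgaImages": [
        {
            "FpgaImageId": "afi-06f41586bdba62eb0",
            "FpgaImageGlobalId": "agfi-0cb29ac225e5599f8",
            "State": {
                "Message": "UNKNOWN_BITSTREAM_GENERATE_ERROR: An unexpected error occurred generating the bitstream",
                "Code": "failed"
            }
        }
    ]
}
"""

for image_id, code, message in summarize_fpga_images(SAMPLE):
    print(image_id, code, message)
```

A "failed" code here means the AFI was never produced, so there is nothing to load from it.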

wilderfield commented 6 years ago

The overlays (the "xclbin" files) stored in ml-suite/overlaybins/aws are actually .awsxclbin files. I have already created the AFIs for our users, and they should work fine if you are using an instance in us-east. We dropped the .awsxclbin naming convention to make it easier to run all our code on any platform. Sorry about the confusion. Also, can you start from the FPGA Developer AMI for now? We need to update the ml-suite AMI to support the included xclbins, which were built for Amazon's 2017.4 DSA shell. https://aws.amazon.com/marketplace/pp/B06VVYBLZZ

b04902036 commented 6 years ago

Thanks a lot for your help! I am currently using an instance in us-west-2, which is probably why I got the error. Could you make your AFIs loadable from other regions, or do I have to switch to us-east to use your product? Thank you!

wilderfield commented 6 years ago

You are welcome! Thanks for your patience in trying out the ml-suite. Once we bundle up the new ml-suite AMI, the region problem will be automatically handled when you launch the AMI. Also, I urge you to do your work on a fresh FPGA Developer AMI to start off with, because that AMI includes the latest Amazon shell, which is compatible with the xclbins we've included in the repo.

b04902036 commented 6 years ago

Uh, just to confirm: so I don't have to deal with the region, and can just launch a fresh FPGA Developer AMI? My instance is in us-west-2; do I need a special aws configure setting to load your AFIs from us-east? Thank you for your patience and answers!

wilderfield commented 6 years ago

Just use us-east for now. Is there a major reason you want to be in a different region? If so, I can look into testing it tomorrow.

b04902036 commented 6 years ago

Never mind, I was just wondering whether AFIs can be loaded from a different region. Thanks for your help anyway! I can now finally load your AFIs successfully. I am closing this issue, since the problem seems to be that AFIs cannot be loaded from a different region.
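For anyone who hits the same error, one way to check whether an AFI is visible from a given region is the AWS CLI command aws ec2 describe-fpga-images with --fpga-image-ids and --region. The Python sketch below only builds and prints that command line; it does not call AWS. The function name describe_afi_command is hypothetical, the AFI ID is simply the one from the log earlier in this thread, and "us-east-1" is an assumption, since the thread only says "us-east".

```python
import shlex


def describe_afi_command(afi_id, region):
    """Build the AWS CLI argv that looks up one AFI in one region (hypothetical helper)."""
    return [
        "aws", "ec2", "describe-fpga-images",
        "--fpga-image-ids", afi_id,
        "--region", region,
    ]


# Assumed values for illustration only: the AFI ID comes from the log in this
# thread, and "us-east-1" stands in for the unspecified "us-east" region.
for region in ("us-east-1", "us-west-2"):
    argv = describe_afi_command("afi-06f41586bdba62eb0", region)
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Running the printed commands with configured credentials should show, region by region, whether the image can be described there.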