fastmachinelearning / hls4ml

Machine learning on FPGAs using HLS
https://fastmachinelearning.org/hls4ml
Apache License 2.0

Reports not found! #1030

Open aleenaelsageorge12 opened 1 month ago

aleenaelsageorge12 commented 1 month ago

I was running the sample code found on https://github.com/fastmachinelearning/hls4ml-tutorial, and the following error occurred. The specs of my setup are as follows:

OS: Ubuntu 18.04.2
Vivado HLS: 2019.2
Python: 3.9

Following is the LSTM model we used.

import myutils
from datetime import datetime
import sys
import os
import pickle
from keras.models import load_model
from gensim.models import Word2Vec, KeyedVectors
from keras.preprocessing import sequence
from sklearn.metrics import accuracy_score
from sklearn.metrics import precision_score
from sklearn.metrics import recall_score
from sklearn.metrics import f1_score
import tensorflow as tf
import numpy

import os
import tensorflow as tf
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

import plotting
import matplotlib.pyplot as plt
from sklearn.metrics import accuracy_score

import hls4ml
import numpy as np
import os

os.environ['PATH'] += os.pathsep + '/home/sh/Vivado/2019.2/bin'

# default mode / type of vulnerability
mode = "xss"

# get the vulnerability from the command line argument
if (len(sys.argv) > 1):
    mode = sys.argv[1]

model = load_model('modellstm/LSTMmodel'+mode+'.h5',custom_objects={'f1_loss': myutils.f1_loss, 'f1':myutils.f1})

with open('data/' + mode + '_dataset_finaltest_X', 'rb') as fp:
    FinaltestX = pickle.load(fp)
with open('data/' + mode + '_dataset_finaltest_Y', 'rb') as fp:
    FinaltestY = pickle.load(fp)

now = datetime.now() # current date and time
nowformat = now.strftime("%H:%M")

# Prepare the data for the LSTM model
X_finaltest = numpy.array(FinaltestX, dtype="object")
y_finaltest = numpy.array(FinaltestY, dtype="object")

# in the original collection of data, the 0 and 1 were used the other way round,
# so now they are switched so that "1" means vulnerable and "0" means clean.
for i in range(len(y_finaltest)):
    if y_finaltest[i] == 0:
        y_finaltest[i] = 1
    else:
        y_finaltest[i] = 0

now = datetime.now() # current date and time
nowformat = now.strftime("%H:%M")

print(str(len(X_finaltest)) + " samples in the final test set.")

csum = 0
for y in y_finaltest:
    csum = csum+y

print("percentage of vulnerable samples: " + str(int((csum / len(X_finaltest)) * 10000)/100) + "%")
print("absolute amount of vulnerable samples in test set: " + str(csum))

# padding sequences on the same length

max_length = 200
X_finaltest = sequence.pad_sequences(X_finaltest, maxlen=max_length)

X_finaltest = numpy.asarray(X_finaltest).astype(numpy.float32)
y_finaltest = numpy.asarray(y_finaltest).astype(numpy.float32)

yhat_classes = (model.predict(X_finaltest) > 0.5).astype("int32")

accuracy = accuracy_score(y_finaltest, yhat_classes)

precision = precision_score(y_finaltest, yhat_classes)
recall = recall_score(y_finaltest, yhat_classes)
F1Score = f1_score(y_finaltest, yhat_classes)

print("keras Accuracy: " + str(accuracy))
print("keras Precision: " + str(precision))
print("keras Recall: " + str(recall))
print('keras F1 score: %f' % F1Score)
print("\n")

config = hls4ml.utils.config_from_keras_model(model, granularity='model')
print("-----------------------------------")
print("Configuration")
plotting.print_dict(config)
print("-----------------------------------")
hls_model = hls4ml.converters.convert_from_keras_model(
    model, hls_config=config, output_dir='model_1/hls4ml_prj', part='xcu250-figd2104-2L-e'
)

hls_model.compile()

X_finaltest = np.ascontiguousarray(X_finaltest)

y_hls = (hls_model.predict(X_finaltest) > 0.5).astype("int32")

accuracyhls = accuracy_score(y_finaltest, y_hls)

print("hls Accuracy: " + str(accuracyhls))
print("keras Accuracy: " + str(accuracy))

hls_model.build(csim=True)

hls4ml.report.read_vivado_report('model_1/hls4ml_prj/')

This is the output we get. Why is the co-simulation report not found?

2024-07-07 22:33:41.106293: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-07-07 22:33:41.543981: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-07-07 22:33:41.545071: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-07-07 22:33:43.163030: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
/home/sh/anaconda3/lib/python3.9/site-packages/hls4ml/converters/__init__.py:27: UserWarning: WARNING: Pytorch converter is not enabled!
  warnings.warn("WARNING: Pytorch converter is not enabled!", stacklevel=1)
WARNING: Failed to import handlers from pooling.py: No module named 'torch'.
WARNING: Failed to import handlers from core.py: No module named 'torch'.
WARNING: Failed to import handlers from reshape.py: No module named 'torch'.
WARNING: Failed to import handlers from merge.py: No module named 'torch'.
WARNING: Failed to import handlers from convolution.py: No module named 'torch'.
8277 samples in the final test set.
percentage of vulnerable samples: 8.98%
absolute amount of vulnerable samples in test set: 744
2024-07-07 22:34:12.832143: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 1986480000 exceeds 10% of free system memory.
259/259 [==============================] - 23s 77ms/step
keras Accuracy: 0.9098707261084934
keras Precision: 0.49876543209876545
keras Recall: 0.543010752688172
keras F1 score: 0.519949

Interpreting Sequential Topology:
Layer name: lstm_1_input, layer type: InputLayer, input shapes: [[None, 200, 300]], output shape: [None, 200, 300]
Layer name: lstm_1, layer type: LSTM, input shapes: [[None, 200, 300]], output shape: [None, 100]
Layer name: dense_1, layer type: Dense, input shapes: [[None, 100]], output shape: [None, 1]

Configuration
Model
  Precision: fixed<16,6>
  ReuseFactor: 1
  Strategy: Latency
  BramFactor: 1000000000
  TraceOutput: False

Interpreting Sequential Topology:
Layer name: lstm_1_input, layer type: InputLayer, input shapes: [[None, 200, 300]], output shape: [None, 200, 300]
Layer name: lstm_1, layer type: LSTM, input shapes: [[None, 200, 300]], output shape: [None, 100]
Layer name: dense_1, layer type: Dense, input shapes: [[None, 100]], output shape: [None, 1]
Creating HLS model
Writing HLS project
/home/sh/anaconda3/lib/python3.9/site-packages/keras/src/engine/training.py:3000: UserWarning: You are saving your model as an HDF5 file via model.save(). This file format is considered legacy. We recommend using instead the native Keras format, e.g. model.save('my_model.keras').
  saving_api.save_model(
Done
hls Accuracy: 0.90515887398816
keras Accuracy: 0.9098707261084934

** Vivado(TM) HLS - High-Level Synthesis from C, C++ and SystemC v2019.2 (64-bit)
SW Build 2708876 on Wed Nov 6 21:39:14 MST 2019
IP Build 2700528 on Thu Nov 7 00:09:20 MST 2019
** Copyright 1986-2019 Xilinx, Inc. All Rights Reserved.

source /home/sh/Vivado/2019.2/scripts/vivado_hls/hls.tcl -notrace
INFO: [HLS 200-10] Running '/home/sh/Vivado/2019.2/bin/unwrapped/lnx64.o/vivado_hls'
INFO: [HLS 200-10] For user 'sh' on host 'sh-VirtualBox' (Linux_x86_64 version 5.4.0-150-generic) on Sun Jul 07 23:33:18 CEST 2024
INFO: [HLS 200-10] On os Ubuntu 18.04.6 LTS
INFO: [HLS 200-10] In directory '/home/sh/model_1/hls4ml_prj'
Sourcing Tcl script 'build_prj.tcl'
INFO: [HLS 200-10] Opening project '/home/sh/model_1/hls4ml_prj/myproject_prj'.
INFO: [HLS 200-10] Adding design file 'firmware/myproject.cpp' to the project
INFO: [HLS 200-10] Adding test bench file 'myproject_test.cpp' to the project
INFO: [HLS 200-10] Adding test bench file 'firmware/weights' to the project
INFO: [HLS 200-10] Adding test bench file 'tb_data' to the project
INFO: [HLS 200-10] Opening solution '/home/sh/model_1/hls4ml_prj/myproject_prj/solution1'.
INFO: [SYN 201-201] Setting up clock 'default' with a period of 5ns.
INFO: [SYN 201-201] Setting up clock 'default' with an uncertainty of 0.625ns.
INFO: [HLS 200-10] Setting target device to 'xcu250-figd2104-2L-e'
INFO: [XFORM 203-101] Allowed max sub elements number after partition is 4096.
INFO: [XFORM 203-1161] The maximum of name length is set into 80.
INFO: [XFORM 203-101] Allowed max sub elements number after partition is 4096.
INFO: [XFORM 203-1161] The maximum of name length is set into 80.
C/RTL SYNTHESIS
INFO: [SCHED 204-61] Option 'relax_ii_for_timing' is enabled, will increase II to preserve clock frequency constraints.
INFO: [HLS 200-10] Analyzing design file 'firmware/myproject.cpp' ...
INFO: [HLS 200-10] Analyzing design file 'firmware/myproject_axi.cpp' ...
INFO: [HLS 200-111] Finished Linking Time (s): cpu = 00:01:15 ; elapsed = 00:01:24 . Memory (MB): peak = 908.145 ; gain = 459.035 ; free physical = 584 ; free virtual = 1714
INFO: [HLS 200-111] Finished Checking Pragmas Time (s): cpu = 00:01:15 ; elapsed = 00:01:24 . Memory (MB): peak = 908.145 ; gain = 459.035 ; free physical = 584 ; free virtual = 1714
INFO: [HLS 200-10] Starting code transformations ...
INFO: [XFORM 203-603] Inlining function 'nnet::product::mult<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0> >::product' into 'nnet::dense_latency<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, config2_1>' (firmware/nnet_utils/nnet_dense_latency.h:42).
INFO: [XFORM 203-603] Inlining function 'nnet::product::mult<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0> >::product' into 'nnet::dense_latency<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, config2_2>' (firmware/nnet_utils/nnet_dense_latency.h:42).
INFO: [XFORM 203-603] Inlining function 'nnet::product::mult<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0> >::product' into 'nnet::dense_latency<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, config3>' (firmware/nnet_utils/nnet_dense_latency.h:42).
INFO: [XFORM 203-603] Inlining function 'nnet::dense<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, config2_1>' into 'nnet::lstm_static<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, config2>' (firmware/nnet_utils/nnet_recurrent.h:149).
INFO: [XFORM 203-603] Inlining function 'nnet::dense<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, config2_2>' into 'nnet::lstm_static<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, config2>' (firmware/nnet_utils/nnet_recurrent.h:150).
INFO: [XFORM 203-603] Inlining function 'nnet::dense<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, config3>' into 'myproject' (firmware/myproject.cpp:43).
INFO: [HLS 200-111] Finished Standard Transforms Time (s): cpu = 00:10:17 ; elapsed = 00:10:40 . Memory (MB): peak = 1644.148 ; gain = 1195.039 ; free physical = 155 ; free virtual = 994
INFO: [HLS 200-10] Checking synthesizability ...
INFO: [XFORM 203-602] Inlining function 'nnet::cast<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, config2_1>' into 'nnet::dense_latency<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, config2_1>' (firmware/nnet_utils/nnet_dense_latency.h:66) automatically.
INFO: [XFORM 203-602] Inlining function 'nnet::cast<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, config2_2>' into 'nnet::dense_latency<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, config2_2>' (firmware/nnet_utils/nnet_dense_latency.h:66) automatically.
INFO: [XFORM 203-602] Inlining function 'nnet::activation::sigmoid<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, sigmoid_config2_recr>::activation' into 'nnet::lstm_static<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, config2>' (firmware/nnet_utils/nnet_recurrent.h:166) automatically.
INFO: [XFORM 203-602] Inlining function 'nnet::activation::tanh<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, tanh_config2>::activation' into 'nnet::lstm_static<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, config2>' (firmware/nnet_utils/nnet_recurrent.h:170) automatically.
INFO: [XFORM 203-602] Inlining function 'nnet::cast<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, config3>' into 'nnet::dense_latency<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, config3>' (firmware/nnet_utils/nnet_dense_latency.h:66) automatically.
INFO: [HLS 200-111] Finished Checking Synthesizability Time (s): cpu = 00:10:19 ; elapsed = 00:10:42 . Memory (MB): peak = 1644.148 ; gain = 1195.039 ; free physical = 152 ; free virtual = 994
INFO: [XFORM 203-502] Unrolling all loops for pipelining in function 'nnet::dense_latency<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, config3>' (firmware/nnet_utils/nnet_dense_latency.h:13:43).
INFO: [XFORM 203-502] Unrolling all loops for pipelining in function 'nnet::lstm_stack<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, config2>' (firmware/nnet_utils/nnet_recurrent.h:199:45).
INFO: [XFORM 203-502] Unrolling all loops for pipelining in function 'nnet::lstm_static<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, config2>' (firmware/nnet_utils/nnet_recurrent.h:41:50).
INFO: [XFORM 203-502] Unrolling all loops for pipelining in function 'nnet::tanh<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, tanh_config2>' (firmware/nnet_utils/nnet_activation.h:427:43).
INFO: [XFORM 203-502] Unrolling all loops for pipelining in function 'nnet::sigmoid<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, sigmoid_config2_recr>' (firmware/nnet_utils/nnet_activation.h:109:43).
INFO: [XFORM 203-502] Unrolling all loops for pipelining in function 'nnet::dense_latency<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, config2_2>' (firmware/nnet_utils/nnet_dense_latency.h:17:48).
INFO: [XFORM 203-502] Unrolling all loops for pipelining in function 'nnet::dense_latency<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, config2_1>' (firmware/nnet_utils/nnet_dense_latency.h:17:48).
INFO: [HLS 200-489] Unrolling loop 'Product1' (firmware/nnet_utils/nnet_dense_latency.h:37) in function 'nnet::dense_latency<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, config3>' completely with a factor of 100.
INFO: [HLS 200-489] Unrolling loop 'Accum1' (firmware/nnet_utils/nnet_dense_latency.h:54) in function 'nnet::dense_latency<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, config3>' completely with a factor of 100.
INFO: [HLS 200-489] Unrolling loop 'Loop-1' (firmware/nnet_utils/nnet_recurrent.h:205) in function 'nnet::lstm_stack<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, config2>' completely with a factor of 100.
INFO: [HLS 200-489] Unrolling loop 'Loop-2' (firmware/nnet_utils/nnet_recurrent.h:210) in function 'nnet::lstm_stack<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, config2>' completely with a factor of 200.
ERROR: [XFORM 203-504] Stop unrolling loop 'Loop-2' (firmware/nnet_utils/nnet_recurrent.h:210) in function 'nnet::lstm_stack<ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, config2>' because it may cause large runtime and excessive memory usage due to increase in code size. Please avoid unrolling the loop or form sub-functions for code in the loop body.
ERROR: [HLS 200-70] Pre-synthesis failed.
command 'ap_source' returned error code
    while executing
"source build_prj.tcl"
    ("uplevel" body line 1)
    invoked from within
"uplevel #0 [list source $arg] "

INFO: [Common 17-206] Exiting vivado_hls at Sun Jul 7 23:44:02 2024...
CSynthesis report not found.
Vivado synthesis report not found.
Cosim report not found.
Timing report not found.
Found 1 solution(s) in model_1/hls4ml_prj//myproject_prj.
Reports for solution "solution1":

C SIMULATION RESULT:
INFO: [SIM 2] CSIM start
INFO: [SIM 4] CSIM will launch GCC as the compiler.
Compiling ../../../../myproject_test.cpp in debug mode
Compiling ../../../../firmware/myproject.cpp in debug mode
Compiling ../../../../firmware/myproject_axi.cpp in debug mode
Generating csim.exe
INFO: Unable to open input/predictions file, using default input.
0.000976563
INFO: Saved inference results to file: tb_data/csim_results.log
INFO: [SIM 1] CSim done with 0 errors.
INFO: [SIM 3] CSIM finish

Synthesis report not found.
Co-simulation report not found.

vloncar commented 1 month ago

> Why co-simulation not found?

Because you didn't run co-sim. Check the parameters of hls_model.build().

As for the second error, your model is simply too large.
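To get a rough feel for "too large", here is a back-of-the-envelope sketch using the shapes printed in the log (200 timesteps, 300 input features, 100 LSTM units). It uses the standard Keras LSTM parameter count; the helper names are hypothetical, and treating the 200-iteration 'Loop-2' as the timestep loop is an assumption based on the unroll factor in the log.

```python
def lstm_param_count(input_dim, units):
    """Trainable parameters of a standard LSTM layer: 4 gates, each with
    input weights, recurrent weights, and a bias."""
    return 4 * units * (input_dim + units + 1)


def lstm_mults_per_step(input_dim, units):
    """Weight multiplications for one timestep (biases excluded)."""
    return 4 * units * (input_dim + units)


# Shapes taken from the "Interpreting Sequential Topology" output above.
timesteps, input_dim, units = 200, 300, 100

params = lstm_param_count(input_dim, units)
mults_per_step = lstm_mults_per_step(input_dim, units)
mults_total = timesteps * mults_per_step

print(f"LSTM parameters:               {params}")
print(f"multiplications per timestep:  {mults_per_step}")
print(f"multiplications, 200 steps:    {mults_total}")
```

With Strategy: Latency and ReuseFactor: 1, the loops are unrolled completely, so the tool is being asked to lay out on the order of tens of millions of multiplications in parallel. Shrinking the sequence length or layer size, or trying the Resource strategy with a larger ReuseFactor, are the usual levers, though whether any of them is enough for this model is not something this estimate can tell you.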