ucb-bar / gemmini

Berkeley's Spatial Array Generator

resnet50 results wrong? #119

Open zhongpanwu opened 3 years ago

zhongpanwu commented 3 years ago

Hello, I ran resnet50.c in Spike on the dev branch, and the output is as follows:

fc_54: gemmini
Prediction: 0 (score: 0)
Prediction: 0 (score: 0)
Prediction: 0 (score: 0)
Prediction: 0 (score: 0)

However, I notice that the four correct class indices in the code are [75, 900, 641, 897].

First, I am curious whether this result is correct. Second, what is the reason for the condition "fc_54_out[preds[i]][i] != fc_54_out[correct[i]][i]" in the result check? Could you provide more details on how you implemented this ResNet50 prototype?
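
One possible reading of that condition, offered as an assumption because the thread never answers it: comparing scores instead of indices lets a prediction count as correct when the predicted class ties with the expected class, which can happen with 8-bit scores. A minimal Python sketch, where `fc_out`, `preds`, and `correct` are hypothetical stand-ins for the arrays in resnet50.c:

```python
def batch_passes(fc_out, preds, correct):
    """Return True if every predicted class scores the same as the expected class.

    fc_out[c][i] is the score of class c for image i, the same
    [class][image] indexing as in the quoted condition.
    """
    for i in range(len(preds)):
        # Same comparison as the quoted C condition: a difference means failure.
        if fc_out[preds[i]][i] != fc_out[correct[i]][i]:
            return False
    return True


# Toy data (not from Gemmini): 3 classes, 2 images.
# Image 1 has a tie between class 0 and class 2.
fc_out = [
    [5, 9],  # class 0
    [7, 1],  # class 1
    [2, 9],  # class 2
]

tie_ok = batch_passes(fc_out, preds=[1, 0], correct=[1, 2])
mismatch = batch_passes(fc_out, preds=[0, 0], correct=[1, 2])
```

Here `tie_ok` is true even though image 1 was predicted as class 0 and labelled class 2, because both classes have the same score; `mismatch` is false because image 0 was predicted as a lower-scoring class.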

Also, I am trying to run resnet50-baremetal in the Verilator-compiled emulator. After I typed the command ./simulator-chipyard-GemminiRocketConfig imagenet/resnet50-linux-baremetal (I moved the folder under /sims/verilator), it just froze and showed no output. Have you tried this before? Could you guide me on how to run it?

Many thanks in advance.

hngenc commented 3 years ago

If you're on the dev branch, then there should be a file called SPIKE.hash that gives you the correct Spike commit that you need to be on. Afterwards, run these commands:

cd chipyard/toolchains/esp-tools/riscv-isa-sim/build
git fetch
git checkout $SPIKE_COMMIT
make && make install

Where $SPIKE_COMMIT is whatever is in SPIKE.hash.

Also, you shouldn't run the resnet50 binary in verilator; it will take days to finish. You should run it on Firesim instead.

zhongpanwu commented 3 years ago

Thanks! I am grateful for your fast reply.

I followed your steps and updated Spike on my local machine. However, I don't think updating Spike solved the problem. resnet50 runs on my local machine, and I saw "Pass" after running the program (with the updated Spike). My question is why the output of the fully connected (FC) layer shows both the maximum probability and its index as zero. Is this normal?

Thanks again

zhongpanwu commented 3 years ago

Thanks, I think I solved the problem by updating Spike (it didn't work initially, but I downloaded the whole of Chipyard v1.3 and then followed your instructions). My remaining question is: what does the correct array {75, 900, 641, 897} mean? I know each number stands for 1 of the 1000 output nodes, but what do they truly represent? Animals, digits, or something else? How can I get information about the input images?

hngenc commented 3 years ago

We've lost the original images, unfortunately. They were randomly selected from the ImageNet validation dataset.

The numbers correspond to ImageNet1k classes. You could look into the ImageNet dataset to find which number maps to what class.
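
A minimal Python sketch of that lookup, assuming you have obtained a list of the 1000 ImageNet-1k class names ordered by class index. The `labels` list below is a hypothetical placeholder and does not contain real class names, and `lookup_classes` is an illustrative helper, not part of Gemmini:

```python
def lookup_classes(indices, labels):
    """Map class indices to names, given labels ordered by class index."""
    return [labels[i] for i in indices]


# Placeholder names only; substitute a real ImageNet-1k label list here.
labels = [f"class_{i}" for i in range(1000)]

# The four expected class indices quoted earlier in this thread.
names = lookup_classes([75, 900, 641, 897], labels)
```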

zhongpanwu commented 3 years ago

@hngenc Hello, I tried to test Gemmini's resnet50.c with 4 different images from ImageNet. I used the same preprocessing steps as another repo (https://github.com/mlcommons/inference/blob/master/vision/classification_and_detection/python/dataset.py) and converted the result to int8. However, I find that some of the converted (int8) values are greater than 100, while the values of the original 4 images in images.h never exceed 100. Is this normal? Can you elaborate on how you preprocessed the images? Thanks

hngenc commented 3 years ago

I think that the original four images first passed through the standard preprocessing steps (which you've apparently already run).

Afterwards, the pixel values were multiplied by a constant scaling factor. I don't remember the exact scaling factor, unfortunately. I suppose you could estimate it by looking at the range of the images in images.h.
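
One way to make that estimate, sketched in Python under an assumption: that "standard preprocessing" means the usual ImageNet normalization (per-channel mean 0.485/0.456/0.406 and std 0.229/0.224/0.225 applied to pixels in [0, 1]). That bounds the normalized values, so the extremes seen in images.h give a lower bound on the scaling factor. `estimate_scale` is a hypothetical helper, and the example inputs are made-up numbers, not measurements from images.h:

```python
# Assumed normalization constants (the common ImageNet values).
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def normalized_bounds():
    """Smallest and largest value a normalized pixel in [0, 1] can take."""
    lo = min((0.0 - m) / s for m, s in zip(MEAN, STD))
    hi = max((1.0 - m) / s for m, s in zip(MEAN, STD))
    return lo, hi


def estimate_scale(observed_min, observed_max):
    """Lower bound on the scaling factor implied by the observed extremes.

    The images may not contain fully black or fully white pixels, so the
    true factor can be larger than either ratio; take the bigger one.
    """
    lo, hi = normalized_bounds()
    return max(observed_min / lo, observed_max / hi)


# Made-up extremes for illustration; measure the real ones from images.h.
scale_guess = estimate_scale(observed_min=-60, observed_max=80)
```

Rounding the result up to a nearby power of two is a reasonable guess, since such factors are common in fixed-point code, but that is a heuristic rather than something the thread confirms.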

Also, in general, the resnet50.c binary is just used to get performance numbers and as a proof-of-concept. It wasn't really intended for people to switch in their own images. It should be possible to do that, but we haven't prepared scripts or a build process for that use-case.

hngenc commented 1 year ago

After looking into this further, I believe we ran this pre-processing transform on images from ImageNet:

transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

Afterwards, we ran the function below to scale the floating-point pixel values to integer, and to permute the tensor to the NHWC format (which is what Gemmini expects):

torch.clamp(image * 32.0, min=-128, max=127).permute([1,2,0])
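
For readers without PyTorch at hand, the scaling and permute steps can be sketched in plain Python. The clamp mirrors the snippet above; the rounding and conversion to an integer are an assumption, since the thread shows only the clamp and does not say how the final int8 cast was done. `quantize_pixel` and `chw_to_hwc` are hypothetical helper names:

```python
def quantize_pixel(x, scale=32.0):
    """Scale a normalized pixel, clamp it to the int8 range, and round.

    The rounding step is an assumption; the thread's snippet stops at the
    clamp. Note that Python's round() rounds exact halves to even.
    """
    y = max(-128.0, min(127.0, x * scale))
    return int(round(y))


def chw_to_hwc(img):
    """Reorder a nested list from img[c][h][w] to out[h][w][c]."""
    channels, height, width = len(img), len(img[0]), len(img[0][0])
    return [
        [[img[c][h][w] for c in range(channels)] for w in range(width)]
        for h in range(height)
    ]


# Largest value the Normalize step above can produce: (1 - 0.406) / 0.225.
brightest = quantize_pixel((1.0 - 0.406) / 0.225)

# Tiny 3-channel, 1x2 image to show the layout change.
hwc = chw_to_hwc([[[1, 2]], [[3, 4]], [[5, 6]]])
```

With the normalization above, values fall roughly between -2.12 and 2.64, so after scaling by 32 they land between about -68 and 84. That is consistent with the earlier observation that the values in images.h never exceed 100, and the clamp only matters for inputs outside that range.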

zhongpanwu commented 1 year ago

Thank you! Are the weights and activations quantized in your resnet50 model?

hngenc commented 1 year ago

Yup, the weights and activations are quantized to 8-bit signed integers.