Inconsistent predictions between models

kyrozepto commented 1 day ago

The deployed model returns different predictions from the ones the model gave when it was first evaluated. My first guess is a data type mismatch during image preprocessing in handler.py:

class WastePredictionServicer:
    def predict(self, image_bytes: bytes):
        try:
            image = Image.open(io.BytesIO(image_bytes)).resize((224, 224))
            image_array = np.expand_dims(np.array(image) / 255.0, axis=0)

Try changing the last line to:

image_array = np.expand_dims(np.array(image, dtype=np.uint8), axis=0)
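
To make the difference between the two lines concrete, here is a minimal sketch that uses a small synthetic array in place of a real image (the array and the variable names are illustrative and not taken from handler.py). Dividing by 255.0 produces float64 values in [0, 1], while the suggested line keeps uint8 values in [0, 255]. If the model already rescales its input internally (an assumption, since the model definition is not shown here), the first form would scale the pixels twice.

```python
import numpy as np

# Synthetic 2x2 RGB "image" standing in for the resized PIL image (illustrative only).
pixels = np.array(
    [[[0, 128, 255], [10, 20, 30]],
     [[255, 255, 255], [0, 0, 0]]],
    dtype=np.uint8,
)

# Original handler.py preprocessing: scales to [0, 1] and promotes to float64.
scaled = np.expand_dims(pixels / 255.0, axis=0)

# Suggested preprocessing: keeps the raw uint8 values in [0, 255].
raw = np.expand_dims(np.array(pixels, dtype=np.uint8), axis=0)

# Both have the same batch shape, but dtype and value range differ.
print(scaled.shape, scaled.dtype, scaled.min(), scaled.max())
print(raw.shape, raw.dtype, raw.min(), raw.max())
```

Which form is correct depends on what the model expects as input, so the preprocessing used at training time is the reference to match.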

x1nx3r commented 1 day ago

Your guess is absolutely spot on, sorry for the oversight. I'll fix this up real quick and notify you when it's done. Thank you for the input.

x1nx3r commented 1 day ago

I've applied the fix and deployed it, so it should be online now. From my limited testing it looks like it's working fine, but I'd be glad if you could test it from your side.

kyrozepto commented 22 hours ago

Thanks for the fix. The dtype adjustment has slightly improved the predictions in the deployed environment, but they still differ from the results I get on Kaggle.

When Metal_10.jpg is tested in the deployed environment, it returns the following prediction:

{
  "imageUrl": "#####",
  "predicted_class": "logam",
  "waste_type": "anorganik",
  "probabilities": {
    "buah_sayuran": 0.00000935040952754207,
    "daun": 0.00001280352626054082,
    "elektronik": 0.00000842738700157497,
    "kaca": 0.0003416250692680478,
    "kertas": 0.0000979607502813451,
    "logam": 0.5288726091384888,
    "makanan": 0.00001739530125632882,
    "medis": 0.47051578760147095,
    "plastik": 0.00006641953950747848,
    "tekstil": 0.000057631998060969636
  }
}

The model tested on Kaggle returns the following prediction for the same image:

1/1 [==============================] - 0s 133ms/step
{'predicted_class': 'logam', 'waste_type': 'anorganik', 'probabilities': {'buah_sayuran': 0.00015537232684437186, 'daun': 2.2661552065983415e-05, 'elektronik': 3.881779048242606e-05, 'kaca': 2.870157732104417e-05, 'kertas': 5.2285584388300776e-05, 'logam': 0.9991063475608826, 'makanan': 4.0144528611563146e-05, 'medis': 0.00022748325136490166, 'plastik': 1.5667666275476222e-06, 'tekstil': 0.000326696434058249}}

The Kaggle test uses similar code:

import io

import numpy as np
from PIL import Image

def load_image_as_bytes(image_path: str):
    """Load an image from the file system and return it as bytes."""
    with open(image_path, 'rb') as f:
        image_bytes = f.read()
    return image_bytes

def predict_image_details(model, image_path, class_labels, waste_types):
    try:
        image_bytes = load_image_as_bytes(image_path)

        image = Image.open(io.BytesIO(image_bytes)).resize((224, 224))
        image_array = np.expand_dims(np.array(image, dtype=np.uint8), axis=0)

        prediction = model.predict(image_array)
        predicted_index = np.argmax(prediction[0])
        predicted_class = class_labels[predicted_index]
        waste_type = waste_types[predicted_class]

        response = {
            "predicted_class": predicted_class,
            "waste_type": waste_type,
            "probabilities": {
                class_labels[i]: float(prob) for i, prob in enumerate(prediction[0])
            }
        }
        return response
    except Exception as e:
        return {"error": str(e)}

image_path = "/kaggle/input/realwaste/realwaste-main/RealWaste/Metal/Metal_10.jpg"
details = predict_image_details(model, image_path, class_labels, waste_types)
print(details)

The model tested on Kaggle returns a confidence score above 99% for logam, whereas the model in the deployed environment returns only about 53%.
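
For a quick side-by-side, the sketch below compares the two outputs quoted above. The probabilities are copied from those outputs and rounded to six decimals for brevity; the helper names `top_two` and `largest_shifts` are made up for this illustration.

```python
# Probabilities from the deployed response (rounded to 6 decimals).
deployed = {
    "buah_sayuran": 0.000009, "daun": 0.000013, "elektronik": 0.000008,
    "kaca": 0.000342, "kertas": 0.000098, "logam": 0.528873,
    "makanan": 0.000017, "medis": 0.470516, "plastik": 0.000066,
    "tekstil": 0.000058,
}

# Probabilities from the Kaggle run (rounded to 6 decimals).
kaggle = {
    "buah_sayuran": 0.000155, "daun": 0.000023, "elektronik": 0.000039,
    "kaca": 0.000029, "kertas": 0.000052, "logam": 0.999106,
    "makanan": 0.000040, "medis": 0.000227, "plastik": 0.000002,
    "tekstil": 0.000327,
}


def top_two(probs):
    """Return the two most probable (label, probability) pairs, highest first."""
    return sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:2]


def largest_shifts(a, b, n=2):
    """Return the n labels whose probability differs most between a and b."""
    diffs = {label: abs(a[label] - b[label]) for label in a}
    return sorted(diffs.items(), key=lambda kv: kv[1], reverse=True)[:n]


print("deployed top two:", top_two(deployed))
print("kaggle top two:  ", top_two(kaggle))
print("largest shifts:  ", largest_shifts(deployed, kaggle))
```

Both environments agree on the predicted class, and nearly all of the probability that logam loses in deployment lands on medis, which may help narrow down where the inputs diverge.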

I would like to keep this issue open until we identify the cause of the prediction differences, and then improve the model's accuracy further.
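
One way to narrow this down is to check whether both environments feed the model the same array. The sketch below is only a suggestion: `fingerprint` is a hypothetical helper, not part of the repository, and the zero-filled array stands in for the real `image_array` built just before `model.predict`. Running it on Metal_10.jpg in both places and comparing the results would show whether the preprocessed inputs are byte-identical.

```python
import hashlib

import numpy as np


def fingerprint(image_array):
    """Summarize a preprocessed input so two environments can be compared."""
    arr = np.ascontiguousarray(image_array)
    return {
        "shape": arr.shape,
        "dtype": str(arr.dtype),
        "min": float(arr.min()),
        "max": float(arr.max()),
        "mean": float(arr.mean()),
        # Hash of the raw bytes: equal only if every value and the dtype match.
        "sha256": hashlib.sha256(arr.tobytes()).hexdigest(),
    }


# Stand-in input (assumption): replace with the real image_array in each environment.
example = np.zeros((1, 224, 224, 3), dtype=np.uint8)
print(fingerprint(example))
```

If the fingerprints match, the difference more likely lies in the model file or the runtime. If they differ, the image decoding, the resize step, or the library versions are the first things to check.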

Bin-Detective / bindetective-ML-backend, issue #1