apple / tensorflow_macos

TensorFlow for macOS 11.0+ accelerated using Apple's ML Compute framework.

Transfer Learning with EfficientNetB7 exited with Error #36

Open gowtamvamsi opened 3 years ago

gowtamvamsi commented 3 years ago

Error:

warnings.warn('`Model.fit_generator` is deprecated and '
2020-11-23 14:56:55.114107: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
Epoch 1/10
Error: command buffer exited with error status.
    The Metal Performance Shaders operations encoded on it may not have completed.
    Error: 
    (null)
    Internal Error (IOAF code -536870211)
    <GFX10_MtlCmdBuffer: 0x7fd02a008200>
    label = <none> 
    device = <GFX10_MtlDevice: 0x7fd0222f9000>
        name = AMD Radeon Pro 5500M 
    commandQueue = <GFXAAMD_MtlCmdQueue: 0x7fd02f9e7b20>
        label = <none> 
        device = <GFX10_MtlDevice: 0x7fd0222f9000>
            name = AMD Radeon Pro 5500M 
    retainedReferences = 1
/AppleInternal/BuildRoot/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MetalPerformanceShaders-124.0.30/MPSNeuralNetwork/Filters/MPSCNNKernel.mm:1335: failed assertion `[MPSCNNBatchNormalizationStatistics encodeBatchToCommandBuffer:sourceImages:inStates:destinationImages:] Error: the source image texture is uninitialized.
    This typically means that nothing has written to it yet, and its contents are undefined.
<MPSImage: 0x7fd030974190> ""
    device: 0x7fd02f429200 "AMD Radeon Pro 5500M"
    width: 150
    height: 150
    featureChannelsPerImage: 288
    numberOfImages: 1
    MTLPixelFormat: MTLPixelFormatRGBA32Float
    feature channel format: MPSImageFeatureChannelFormatFloat32
    parent:  0x0
    texture: 0x0

'

Code used:

#import cv2
import pandas as pd
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras import layers
from tensorflow.keras.layers.experimental import preprocessing
from tensorflow.python.compiler.mlcompute import mlcompute

mlcompute.set_mlc_device(device_name='gpu')

# img = cv2.imread('cassava-leaf-disease-classification/train_images/3988625744.jpg')
# img.shape

df = pd.read_csv('cassava-leaf-disease-classification/train.csv')
labels_dict = {
    0: "Cassava Bacterial Blight (CBB)",
    1: "Cassava Brown Streak Disease (CBSD)",
    2: "Cassava Green Mottle (CGM)",
    3: "Cassava Mosaic Disease (CMD)",
    4: "Healthy",
}

df.head()
df['label'] = df['label'].apply(lambda x: labels_dict[x])

datagen = ImageDataGenerator(
    featurewise_center=False,
    samplewise_center=False,
    featurewise_std_normalization=False,
    samplewise_std_normalization=False,
    zca_whitening=False,
    zca_epsilon=1e-06,
    rotation_range=0,
    width_shift_range=0.0,
    height_shift_range=0.0,
    brightness_range=None,
    shear_range=0.0,
    zoom_range=0.0,
    channel_shift_range=0.0,
    fill_mode="nearest",
    cval=0.0,
    horizontal_flip=False,
    vertical_flip=False,
    rescale=None,
    preprocessing_function=None,
    data_format=None,
    validation_split=0.0,
    dtype=None,
)

IMG_SIZE = 600
NUM_CLASSES = 5

train_generator = datagen.flow_from_dataframe(dataframe= df,
                                              directory='cassava-leaf-disease-classification/train_images/',
                                              x_col = 'image_id',
                                              y_col = 'label',
                                              subset = 'training',
                                              batch_size=32,
                                              seed=42,
                                              shuffle=True,
                                              class_mode = 'categorical',
                                              target_size=(IMG_SIZE,IMG_SIZE))

img_augmentation = Sequential(
    [
        preprocessing.RandomRotation(factor=0.15),
        preprocessing.RandomTranslation(height_factor=0.1, width_factor=0.1),
        preprocessing.RandomFlip(),
        preprocessing.RandomContrast(factor=0.1),
    ],
    name="img_augmentation",
)

from tensorflow.keras.applications import EfficientNetB7

inputs = layers.Input(shape=(IMG_SIZE, IMG_SIZE, 3))
x = img_augmentation(inputs)
model = EfficientNetB7(include_top=False, input_tensor=x, weights="imagenet")

model.trainable = False

x = layers.GlobalAveragePooling2D(name="avg_pool")(model.output)
x = layers.BatchNormalization()(x)

top_dropout_rate = 0.2
x = layers.Dropout(top_dropout_rate, name="top_dropout")(x)
outputs = layers.Dense(NUM_CLASSES, activation="softmax", name="pred")(x)

model = tf.keras.Model(inputs, outputs, name="EfficientNet")
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-2)
model.compile(
    optimizer=optimizer, loss="categorical_crossentropy", metrics=["accuracy"]
)

# model.layers[-6].trainable= True
# model.layers[-7].trainable= True
# model.layers[-10].trainable= True
# model.layers[-11].trainable= True

model.summary()

model.fit_generator(generator=train_generator,
                    steps_per_epoch=512,
                    epochs=10)

model.save('saved_model_v1.h5')

anna-tikhonova commented 3 years ago

Could you point me to the dataset you are using to run this example? Thank you!

gowtamvamsi commented 3 years ago

https://www.kaggle.com/c/cassava-leaf-disease-classification/data

anna-tikhonova commented 3 years ago

Thank you! Could you also tell me which config you are running on?

gowtamvamsi commented 3 years ago

System specs:

(tensorflow_macos_venv) sh-3.2# df -h
Filesystem       Size   Used  Avail Capacity iused      ifree %iused  Mounted on
/dev/disk1s1s1  932Gi   14Gi  707Gi     2%  563932 9767414228    0%   /
devfs           193Ki  193Ki    0Bi   100%     668          0  100%   /dev
/dev/disk1s5    932Gi  2.0Gi  707Gi     1%       2 9767978158    0%   /System/Volumes/VM
/dev/disk1s3    932Gi  355Mi  707Gi     1%    1724 9767976436    0%   /System/Volumes/Preboot
/dev/disk1s6    932Gi  2.1Mi  707Gi     1%      13 9767978147    0%   /System/Volumes/Update
/dev/disk1s2    932Gi  208Gi  707Gi    23% 2328403 9765649757    0%   /System/Volumes/Data
map auto_home     0Bi    0Bi    0Bi   100%       0          0  100%   /System/Volumes/Data/home
(tensorflow_macos_venv) sh-3.2# sysctl -a | grep machdep.cpu
machdep.cpu.max_basic: 22
machdep.cpu.max_ext: 2147483656
machdep.cpu.vendor: GenuineIntel
machdep.cpu.brand_string: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
machdep.cpu.family: 6
machdep.cpu.model: 158
machdep.cpu.extmodel: 9
machdep.cpu.extfamily: 0
machdep.cpu.stepping: 13
machdep.cpu.feature_bits: 9221960262849657855
machdep.cpu.leaf7_feature_bits: 43804591 1073741824
machdep.cpu.leaf7_feature_bits_edx: 3154118144
machdep.cpu.extfeature_bits: 1241984796928
machdep.cpu.signature: 591597
machdep.cpu.brand: 0
machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 PCLMULQDQ DTES64 MON DSCPL VMX SMX EST TM2 SSSE3 FMA CX16 TPR PDCM SSE4.1 SSE4.2 x2APIC MOVBE POPCNT AES PCID XSAVE OSXSAVE SEGLIM64 TSCTMR AVX1.0 RDRAND F16C
machdep.cpu.leaf7_features: RDWRFSGS TSC_THREAD_OFFSET SGX BMI1 AVX2 SMEP BMI2 ERMS INVPCID FPU_CSDS MPX RDSEED ADX SMAP CLFSOPT IPT SGXLC MDCLEAR IBRS STIBP L1DF ACAPMSR SSBD
machdep.cpu.extfeatures: SYSCALL XD 1GBPAGE EM64T LAHF LZCNT PREFETCHW RDTSCP TSCI
machdep.cpu.logical_per_package: 16
machdep.cpu.cores_per_package: 8
machdep.cpu.microcode_version: 214
machdep.cpu.processor_flag: 5
machdep.cpu.mwait.linesize_min: 64
machdep.cpu.mwait.linesize_max: 64
machdep.cpu.mwait.extensions: 3
machdep.cpu.mwait.sub_Cstates: 286531872
machdep.cpu.thermal.sensor: 1
machdep.cpu.thermal.dynamic_acceleration: 1
machdep.cpu.thermal.invariant_APIC_timer: 1
machdep.cpu.thermal.thresholds: 2
machdep.cpu.thermal.ACNT_MCNT: 1
machdep.cpu.thermal.core_power_limits: 1
machdep.cpu.thermal.fine_grain_clock_mod: 1
machdep.cpu.thermal.package_thermal_intr: 1
machdep.cpu.thermal.hardware_feedback: 0
machdep.cpu.thermal.energy_policy: 1
machdep.cpu.xsave.extended_state: 31 832 1088 0
machdep.cpu.xsave.extended_state1: 15 832 256 0
machdep.cpu.arch_perf.version: 4
machdep.cpu.arch_perf.number: 4
machdep.cpu.arch_perf.width: 48
machdep.cpu.arch_perf.events_number: 7
machdep.cpu.arch_perf.events: 0
machdep.cpu.arch_perf.fixed_number: 3
machdep.cpu.arch_perf.fixed_width: 48
machdep.cpu.cache.linesize: 64
machdep.cpu.cache.L2_associativity: 4
machdep.cpu.cache.size: 256
machdep.cpu.tlb.inst.large: 8
machdep.cpu.tlb.data.small: 64
machdep.cpu.tlb.data.small_level1: 64
machdep.cpu.address_bits.physical: 39
machdep.cpu.address_bits.virtual: 48
machdep.cpu.core_count: 8
machdep.cpu.thread_count: 16
machdep.cpu.tsc_ccc.numerator: 192
machdep.cpu.tsc_ccc.denominator: 2
(tensorflow_macos_venv) sh-3.2# 

Python version: 3.8

I followed the rest of the steps as mentioned in the README.md

anna-tikhonova commented 3 years ago

@gowtamvamsi This is a large model. We were able to run your example locally using a smaller batch size. Specifically, we used a config with 4GB of GPU VRAM and found that the batch size of 2 worked well. Could you try a smaller batch size on your end? Thank you!
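
For a rough sense of why the batch size matters here, the sketch below estimates the size of the single activation named in the error log (150 x 150, 288 feature channels, float32) at batch sizes 32 and 2. It is back-of-the-envelope arithmetic only: how ML Compute and MPS actually allocate and reuse buffers is not shown in this thread, and the helper names `activation_bytes` and `steps_for_same_samples` are made up for illustration. The second helper works out how many steps per epoch would keep the number of samples seen per epoch unchanged when the batch size drops.

```python
# Illustrative estimate only. Real GPU memory use depends on how
# ML Compute / MPS allocates and reuses buffers, which is an unknown here.

BYTES_PER_FLOAT32 = 4


def activation_bytes(width, height, channels, batch_size):
    """Bytes needed to hold one float32 activation tensor for a whole batch."""
    return width * height * channels * BYTES_PER_FLOAT32 * batch_size


def steps_for_same_samples(old_steps, old_batch_size, new_batch_size):
    """Steps per epoch needed to see the same number of samples per epoch."""
    samples = old_steps * old_batch_size
    return -(-samples // new_batch_size)  # ceiling division


# Dimensions taken from the MPSImage dump in the error log.
WIDTH, HEIGHT, CHANNELS = 150, 150, 288

for batch_size in (32, 2):
    mib = activation_bytes(WIDTH, HEIGHT, CHANNELS, batch_size) / 2**20
    print(f"batch_size={batch_size:>2}: {mib:.1f} MiB for this one activation")

# The original script used steps_per_epoch=512 with batch_size=32.
print(steps_for_same_samples(old_steps=512, old_batch_size=32, new_batch_size=2))
```

With the original script this would mean passing `batch_size=2` to `flow_from_dataframe` and, if the same number of samples per epoch is wanted, raising `steps_per_epoch` to the value printed on the last line. EfficientNetB7 holds many activations of this kind at once, so the single-tensor figure is a lower bound on the memory needed, not a prediction.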