jpsember / java-ml

Java classes for machine learning

Row/column/channel ordering may be incorrect #41

Closed jpsember closed 2 years ago

jpsember commented 2 years ago

The pixel format of the images I am writing out from ml does not necessarily agree with what the Python code expects.

jpsember commented 2 years ago

This is the artificial image set read by Python:

(32, 240)
[   2    1    0    5    4    3    8    7    6   11   10    9   14   13
   12   17   16   15   20   19   18   23   22   21   26   25   24   29
   28   27   32   31   30   35   34   33   38   37   36   41   40   39
   44   43   42   47   46   45   50   49   48   53   52   51   56   55
   54   59   58   57   62   61   60   65   64   63   68   67   66   71
   70   69   74   73   72   77   76   75   80   79   78   83   82   81
   86   85   84   89   88   87   92   91   90   95   94   93   98   97
   96  101  100   99  104  103  102  107  106  105  110  109  108  113
  112  111  116  115  114  119  118  117  122  121  120  125  124  123
 -128  127  126 -125 -126 -127 -122 -123 -124 -119 -120 -121 -116 -117
 -118 -113 -114 -115 -110 -111 -112 -107 -108 -109 -104 -105 -106 -101
 -102 -103  -98  -99 -100  -95  -96  -97  -92  -93  -94  -89  -90  -91
  -86  -87  -88  -83  -84  -85  -80  -81  -82  -77  -78  -79  -74  -75
  -76  -71  -72  -73  -68  -69  -70  -65  -66  -67  -62  -63  -64  -59
  -60  -61  -56  -57  -58  -53  -54  -55  -50  -51  -52  -47  -48  -49
  -44  -45  -46  -41  -42  -43  -38  -39  -40  -35  -36  -37  -32  -33
  -34  -29  -30  -31  -26  -27  -28  -23  -24  -25  -20  -21  -22  -17
  -18  -19]
reshaped to input volume: (32, 8, 10, 3)
[[[   2    1    0]
  [   5    4    3]
  [   8    7    6]
  [  11   10    9]
  [  14   13   12]
  [  17   16   15]
  [  20   19   18]
  [  23   22   21]
  [  26   25   24]
  [  29   28   27]]

 [[  32   31   30]
  [  35   34   33]
  [  38   37   36]
  [  41   40   39]
  [  44   43   42]
  [  47   46   45]
  [  50   49   48]
  [  53   52   51]
  [  56   55   54]
  [  59   58   57]]

 [[  62   61   60]
  [  65   64   63]
  [  68   67   66]
  [  71   70   69]
  [  74   73   72]
  [  77   76   75]
  [  80   79   78]
  [  83   82   81]
  [  86   85   84]
  [  89   88   87]]

 [[  92   91   90]
  [  95   94   93]
  [  98   97   96]
  [ 101  100   99]
  [ 104  103  102]
  [ 107  106  105]
  [ 110  109  108]
  [ 113  112  111]
  [ 116  115  114]
  [ 119  118  117]]

 [[ 122  121  120]
  [ 125  124  123]
  [-128  127  126]
  [-125 -126 -127]
  [-122 -123 -124]
  [-119 -120 -121]
  [-116 -117 -118]
  [-113 -114 -115]
  [-110 -111 -112]
  [-107 -108 -109]]

 [[-104 -105 -106]
  [-101 -102 -103]
  [ -98  -99 -100]
  [ -95  -96  -97]
  [ -92  -93  -94]
  [ -89  -90  -91]
  [ -86  -87  -88]
  [ -83  -84  -85]
  [ -80  -81  -82]
  [ -77  -78  -79]]

 [[ -74  -75  -76]
  [ -71  -72  -73]
  [ -68  -69  -70]
  [ -65  -66  -67]
  [ -62  -63  -64]
  [ -59  -60  -61]
  [ -56  -57  -58]
  [ -53  -54  -55]
  [ -50  -51  -52]
  [ -47  -48  -49]]

 [[ -44  -45  -46]
  [ -41  -42  -43]
  [ -38  -39  -40]
  [ -35  -36  -37]
  [ -32  -33  -34]
  [ -29  -30  -31]
  [ -26  -27  -28]
  [ -23  -24  -25]
  [ -20  -21  -22]
  [ -17  -18  -19]]]
...at (js_train.py  277):printed numpy array of bytes read from filesystem
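
Two things stand out in the dump above: values above 127 print as negative numbers, which is consistent with the bytes being read as signed 8-bit integers, and the three values of each pixel run in descending order (2, 1, 0). A minimal sketch of how a single image with that pattern could be reproduced in NumPy follows. The construction (a counting sequence with each pixel's three bytes reversed) is an assumption inferred from the printed values, not the actual generator, and the names `HEIGHT`, `WIDTH`, `CHANNELS`, `counting`, `pixels`, `flat` and `image` are hypothetical.

```python
import numpy as np

# Dimensions taken from the reshaped volume in the dump: (8, 10, 3) per image.
HEIGHT, WIDTH, CHANNELS = 8, 10, 3

# Assumption: the artificial image is a counting sequence 0..239 in which the
# three bytes of every pixel are stored in reverse order.
counting = np.arange(HEIGHT * WIDTH * CHANNELS, dtype=np.uint8)
pixels = counting.reshape(HEIGHT, WIDTH, CHANNELS)[:, :, ::-1]

# Flatten in row-major order and reinterpret the bytes as signed 8-bit values,
# which is how values above 127 end up negative.
flat = np.ascontiguousarray(pixels).reshape(-1).view(np.int8)

print(flat[:6])       # first two pixels
print(flat[126:132])  # the pixels where the signed values wrap around

# Reshape back into (height, width, channels), as in the dump.
image = flat.reshape(HEIGHT, WIDTH, CHANNELS)
print(image[4, 2])    # row 4, column 2

# Viewing the same bytes as unsigned recovers the 0..255 range.
print(image.view(np.uint8)[4, 2])
```

If the signed values are a problem downstream, viewing the array as `np.uint8` changes only the interpretation of the bytes, not the bytes themselves.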

jpsember commented 2 years ago

In PyTorch, images are represented as [channels, height, width], so a color image would be [3, 256, 256]. During the training you will get batches of images, so your shape in the forward method will get an additional batch dimension at dim0: [batch_size, channels, height, width].

https://discuss.pytorch.org/t/dimensions-of-an-input-image/19439

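I think I need to permute the tensor read from disk so that the channel dimension comes before height and width, i.e. from (batch, height, width, channels), which is the (32, 8, 10, 3) layout printed above, to (batch, channels, height, width).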
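
A minimal sketch of that permutation, using NumPy so it runs without PyTorch installed. The array contents are placeholder data and the names `nhwc` and `nchw` are hypothetical; only the shape (32, 8, 10, 3) comes from the dump earlier in this issue.

```python
import numpy as np

# Shape from the dump: 32 images, 8 rows, 10 columns, 3 channels.
BATCH, HEIGHT, WIDTH, CHANNELS = 32, 8, 10, 3

# Placeholder standing in for the bytes read from disk (assumption).
nhwc = np.zeros((BATCH, HEIGHT, WIDTH, CHANNELS), dtype=np.int8)
nhwc[0, 4, 2] = [-128, 127, 126]  # one recognizable pixel

# Move the channel axis in front of height and width.
# Axis order (0, 3, 1, 2) maps (batch, h, w, c) -> (batch, c, h, w).
nchw = np.transpose(nhwc, (0, 3, 1, 2))

print(nchw.shape)
# The same pixel is now addressed as [batch, :, row, column].
print(nchw[0, :, 4, 2])
```

Note that a plain `reshape` to (32, 3, 8, 10) would not do this: it keeps the bytes in the same order and only relabels the axes, so the channels would stay interleaved. On a PyTorch tensor the equivalent call would be `tensor.permute(0, 3, 1, 2)`.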