Jason-xin opened this issue 5 years ago
@wubaoyuan I need your help, thanks!
@Jason-xin Sorry for the late reply. The code and the arXiv paper are indeed inconsistent. The version in the code is the latest one, while the arXiv paper describes an old version. We will update r_t^j in the arXiv paper as soon as possible.
In the code, (r_t^j)_positive = max(0.01, log10(10/(0.01+t))) < 1; (r_t^j)_negative = max(0.01, log10(10/(8+t))) < 0.1.
The principle of designing r_t^j is monotonically decreasing with respect to t. You can try other decreasing functions in your training.
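The decreasing schedule described above can be sketched as follows. This is a minimal re-implementation of the formulas quoted in this thread, not the repository's actual code; the constants 0.01 and 8 come from the formulas above.

```python
import math

def pos_coef(t):
    """Positive-label coefficient: max(0.01, log10(10 / (0.01 + t)))."""
    return max(0.01, math.log10(10.0 / (0.01 + t)))

def neg_coef(t):
    """Negative-label coefficient: max(0.01, log10(10 / (8 + t)))."""
    return max(0.01, math.log10(10.0 / (8.0 + t)))

# Both coefficients decrease monotonically as t grows,
# then saturate at the 0.01 floor imposed by max().
```

Any other monotonically decreasing function of t would fit the stated design principle; only the decreasing shape matters.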
@Jason-xin For the second question: since the tasks and datasets of pre-training and fine-tuning are significantly different, it is natural to use different pre-processing.
@wubaoyuan OK, another question. In train.py, why is tf.nn.softmax used to calculate probabilities, and why is classes computed with tf.argmax (which predicts only one tag)? In fact, each sample in the ML-Images dataset has multiple tags...

```python
# build model
net = resnet.ResNet(features, is_training=(mode == tf.estimator.ModeKeys.TRAIN))
logits = net.build_model()
predictions = {
    'classes': tf.argmax(logits, axis=1),
    'probabilities': tf.nn.softmax(logits, name='softmax_tensor')
}

if mode == tf.estimator.ModeKeys.PREDICT:
    return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions)
```
@wubaoyuan Also, I changed tf.nn.softmax to tf.nn.sigmoid in image_classification.py and tested the mlimagenet model, so the result should be a multi-label classification, right? I'm not sure whether that is actually correct.
@wubaoyuan With the formula you provided, when t changes from 1 to 2, the ratio of the weights $r_t$ between the positive and negative tag increases from about 20 to about 70. Is this really what you mean by monotonically decreasing? Why not fix r_t^0?
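The ratio quoted above can be checked directly against the formulas given earlier in the thread (a quick sketch; the constants 0.01 and 8 are taken from those formulas):

```python
import math

# Coefficients at t = 1 and t = 2, per the formulas quoted above.
t1_pos = max(0.01, math.log10(10 / (0.01 + 1)))  # ~0.996
t1_neg = max(0.01, math.log10(10 / (8 + 1)))     # ~0.046
t2_pos = max(0.01, math.log10(10 / (0.01 + 2)))  # ~0.697
t2_neg = max(0.01, math.log10(10 / (8 + 2)))     # log10(1) = 0, clamps to 0.01

ratio_t1 = t1_pos / t1_neg  # ~21.8
ratio_t2 = t2_pos / t2_neg  # ~69.7
```

So each coefficient does decrease in t, but once the negative coefficient hits the 0.01 floor, the positive-to-negative ratio jumps, which is exactly the behavior the question points out.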
> The principle of designing r_t^j is monotonically decreasing with respect to t. You can try other decreasing functions in your training.

> @wubaoyuan And, I changed the tf.nn.softmax to tf.nn.sigmoid in image_classification.py, and test mlimagenet model, so the result was multi-label classification? I don't know whether it works out or not?
I have the same question as you! In train.py I didn't change softmax to sigmoid; only when testing with image_classification.py did I change softmax to sigmoid, and it worked reasonably well. But I am really confused: why not change softmax to sigmoid during training, and why compute only the top-1 class accuracy?
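For reference, the difference between the two prediction heads being discussed can be sketched in plain Python. This is a toy sketch, not the repository's code; the 0.5 decision threshold is an assumption for illustration.

```python
import math

def softmax(logits):
    """Softmax: scores compete and sum to 1, so argmax yields a single tag."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def sigmoid(x):
    """Sigmoid: each tag is scored independently, allowing many tags per image."""
    return 1.0 / (1.0 + math.exp(-x))

logits = [2.0, 1.5, -3.0]  # toy logits for three tags

# Single-label head: one winning class via argmax over softmax.
single = max(range(len(logits)), key=lambda i: softmax(logits)[i])

# Multi-label head: every tag whose sigmoid score clears the threshold.
multi = [i for i, x in enumerate(logits) if sigmoid(x) > 0.5]
```

With these toy logits, softmax/argmax keeps only tag 0, while sigmoid thresholding keeps tags 0 and 1, which is why sigmoid is the natural choice at inference time for a multi-label dataset.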
sorry to bother you, I have two questions:
When calculating the loss, the first step is "a. get loss coefficient", with the corresponding code as follows. Does it refer to r in the loss function? The explanation of r doesn't seem to match this code. So, can you tell me what this code does? Especially pos_loss_coef (0.01), neg_loss_coef (8), and loss_coef...
In train.py, record_parser_fn builds the image like this:

```python
image = image_preprocess.preprocess_image(image=image,
                                          output_height=FLAGS.image_size,
                                          output_width=FLAGS.image_size,
                                          object_cover=0.7,
                                          area_cover=0.7,
                                          is_training=is_training,
                                          bbox=bbox)
```

But in finetune.py, record_parser_fn builds the image like this:

```python
image = image_preprocess.preprocess_image(image=image,
                                          output_height=FLAGS.image_size,
                                          output_width=FLAGS.image_size,
                                          object_cover=0.0,
                                          area_cover=0.05,
                                          is_training=is_training,
                                          bbox=bbox)
```

Can you tell me why object_cover and area_cover differ? Thanks!