Closed luocmin closed 4 years ago
Hi @luocmin, thanks for your interest in our work. I'm still cleaning up all of the training scripts, but all the code should be in this PR to train any model from our paper: https://github.com/mseg-dataset/mseg-semantic/pull/9
In which taxonomy/on which datasets/at which resolution would you like to train? You can find more details here.
Hi @johnwlambert, thank you for helping me with this question. Following the link you provided, I found that the mseg-semantic code I downloaded is different from the code in your link. Since I am a novice, the training process is not clear to me, so I also want to ask whether the training code is based on the code from the semseg project mentioned in the README.
Hi @luocmin, I didn't quite understand your question -- do you mind explaining a bit more about the issue you are facing?
The training script in the semseg repo is a starting point for our work, but we add several hundred additional lines of code to their training script to accommodate training on multiple datasets at once, and to incorporate MGDA and domain generalization.
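For intuition on the MGDA part: MGDA (multiple-gradient descent) combines the per-task gradients using the convex combination with minimum norm, which for two tasks has a closed form. The following is a toy sketch in plain Python, not the mseg-semantic implementation (the function name and the two-task restriction are my own for illustration):

```python
# Toy two-task MGDA step: find alpha in [0, 1] minimizing
# ||alpha * g1 + (1 - alpha) * g2||^2  (closed form for two tasks).
# Illustrative sketch only, not the repo's multiobjective_opt code.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mgda_two_task(g1, g2):
    """Return the min-norm convex combination of two gradient vectors."""
    diff = [x - y for x, y in zip(g1, g2)]  # g1 - g2
    denom = dot(diff, diff)
    if denom == 0.0:
        alpha = 0.5  # identical gradients: any alpha gives the same result
    else:
        # Minimizer of ||g2 + alpha * (g1 - g2)||^2, clipped to [0, 1]
        alpha = max(0.0, min(1.0, -dot(diff, g2) / denom))
    return [alpha * x + (1.0 - alpha) * y for x, y in zip(g1, g2)]
```

Note that when the two gradients conflict, the combined direction shrinks toward zero; this scale-balancing across tasks (here, datasets) is the motivation for using MGDA in multi-dataset training.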
Can you pull the latest into your clone/fork of the repo? There shouldn't be any breaking changes.
@johnwlambert, my question is why the code I downloaded does not include some files such as training.md. Sorry, as a novice: if I want to train, besides referring to semseg and MGDA, what other issues should I pay attention to?
@johnwlambert, about the code differences: I found that there are different branches in the repository. I see ccsa_train.py in the domain_generalization directory. Is this script used for training? The project description does not explain how to train; can you give a command to execute the training script? Thank you. The dataset processing and the tests are documented in great detail, but training is not. As a novice, I have the data now but cannot start training. Can you give me some pointers?
train.py is used for training models on individual datasets and on MSeg, using traditional cross-entropy loss and MGDA. ccsa_train.py should be used to train CCSA (domain generalization) models.
Could you let me know more about which model you would like to train (on which datasets, using which taxonomy, using which resolution, and which training technique) and I can point you to the relevant config? There are admittedly a ton of experiment config files since we released dozens of models.
I'm basically trying to solve lane-marking segmentation, and my idea is to follow your approach of training on multiple datasets to achieve generalization. At present, I have remapped and relabeled the relevant datasets according to the mseg-api you provided, but I have a question about the difference between dataset remap and relabel. Since the code involved is a bit confusing, I would also like to ask whether the train.py you are referring to is mseg_semantic/tool/train.py or semseg/tool/train.py.
The elements framed in red (in my screenshot) do not exist at the file path it points to.
I see, thanks for the explanation. Adding lane markings to the universal taxonomy is an interesting experiment and could be quite valuable for self-driving applications. We excluded it from the universal taxonomy since it didn't adhere to the principles in our decision tree (from our paper): lane markings are labeled as "road" in Cityscapes, BDD, IDD, COCO, ADE20K, etc. I'm interested to hear what you discover.
The train.py from our repo is what you should use, since it merges multiple datasets at training time using our TaxonomyConverter class.
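Conceptually, merging datasets amounts to remapping each dataset's native label ids into one shared universal taxonomy before computing the loss. A minimal sketch of that idea (the class name, label tables, and method below are hypothetical illustrations, not the actual TaxonomyConverter API):

```python
# Conceptual sketch of mapping per-dataset labels into a universal
# taxonomy. Names and tables are hypothetical, not the real mseg API.

UNIVERSAL = ["road", "sidewalk", "person", "unlabeled"]

# Per-dataset mapping from native class name -> universal class name.
DATASET_TO_UNIVERSAL = {
    "cityscapes": {"road": "road", "sidewalk": "sidewalk", "person": "person"},
    "ade20k": {"road, route": "road", "person, individual": "person"},
}

class SimpleTaxonomyConverter:
    def __init__(self, universal, tables):
        self.universal_id = {name: i for i, name in enumerate(universal)}
        self.tables = tables

    def to_universal(self, dataset, native_name):
        """Map a native class name to its universal id ('unlabeled' if unmapped)."""
        target = self.tables[dataset].get(native_name, "unlabeled")
        return self.universal_id[target]
```

With this kind of mapping applied to every ground-truth mask, batches drawn from different datasets can be trained against a single output space.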
The relevant configs are:
- 480p: config/train/480_release/mseg-3m.yaml
- 720p: config/train/720_release/mseg-3m.yaml
- 1080p (3 million crops): mseg_semantic/config/train/1080_release/mseg-lowres-3m.yaml
- 1080p (1 million crops): mseg_semantic/config/train/1080_release/mseg-lowres.yaml
Have you downloaded all the datasets as described here, and do the unit tests pass successfully at the end?
I didn't find anything. I studied your paper because my task is to detect illegal vehicle behavior on the road, so I need to identify lane-marking types: double yellow solid lines, zebra crossings, bus lanes, stop lines, and so on. Can you give me some suggestions? Will the weights obtained from training with your code help me?
I have completed the dataset download step by step according to the requirements in mseg-api/download_scripts/README.md (download, unzip, remap, relabel, verify paths, verify relabeling). Only ScanNet in the test set has not been downloaded, because my application has not been approved yet. As for the training set, all 7 datasets have been processed successfully.
Why do the following problems occur?
I just updated https://github.com/mseg-dataset/mseg-semantic/blob/master/training.md.
Can you send me the exact commands you are running?
If the following script were named tool/train-qvga-mix-copy.sh, you would call it as:

```sh
tool/train-qvga-mix-copy.sh 1080_release/mseg-lowres-3m.yaml False exp ${WORK}/copies/final_train/1080_release/mseg-lowres-3m
```
```sh
#!/bin/sh
PYTHON=/home/anaconda3/envs/pth13/bin/python

config=config/final_train/$1
use_mgda=$2
exp_name=$3
new_folder=$4

mkdir -p ${new_folder}
cp -r config consistency dataset lib model multiobjective_opt pba_utils taxonomy tool util vis_utils ${new_folder}
cp taxonomy* ${new_folder}
cd ${new_folder}
echo 'CD into the destination folder'

exp_dir=${exp_name}
model_dir=${exp_dir}/model
result_dir=${exp_dir}/result
now=$(date +"%Y%m%d_%H%M%S")
mkdir -p ${model_dir} ${result_dir}

export PYTHONPATH=./
$PYTHON -u tool/train.py \
  --config=${config} use_mgda ${use_mgda} save_path ${model_dir} auto_resume ${model_dir} \
  2>&1 | tee ${model_dir}/train-$now.log
```
I am running it directly in PyCharm via right-click, not from the command line.
Following your hint, I confirmed that specific lane types on the road are not distinguished; they are all recognized as road. But I want to train with the methods from your paper to obtain pretrained weights, and then fine-tune on my own lane-recognition dataset with its exclusively labeled categories. Is this approach feasible? I don't have much time now, so I would appreciate the author's answer. Thank you.
Author, looking at your code I am very confused and cannot get started; I feel I am too inexperienced.
What does tax_version: 4.0 refer to?
Hi @luocmin, I think you could use your mseg-3m model as a starting point, and replace the final few layers with an expanded taxonomy or just your classes of interest. Then you could fine-tune on Mapillary for your desired classes. Alternatively, you could train from scratch with the expanded taxonomy.
You will need to pass the arguments I mentioned above via command line
```sh
python -u tool/train.py --config=${config} use_mgda ${use_mgda} save_path ${model_dir} auto_resume ${model_dir}
```
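The trailing key/value pairs after `--config` (use_mgda, save_path, auto_resume) follow the semseg-style "opts" override convention, where pairs of tokens update fields loaded from the YAML config. A minimal sketch of that parsing idea (the helper name and type-coercion rules below are my own illustration, not the repo's actual parser):

```python
# Sketch of parsing trailing "KEY VALUE" override pairs, in the style
# of semseg/yacs-like opts lists. Hypothetical helper, not repo code.

def apply_overrides(cfg, opts):
    """Update dict `cfg` in place from a flat [key, value, key, value, ...] list."""
    if len(opts) % 2 != 0:
        raise ValueError("opts must be key/value pairs")
    for key, raw in zip(opts[::2], opts[1::2]):
        # Best-effort conversion: bools, then ints, else keep the string.
        if raw in ("True", "False"):
            value = raw == "True"
        else:
            try:
                value = int(raw)
            except ValueError:
                value = raw
        cfg[key] = value
    return cfg
```

For example, `apply_overrides({"use_mgda": True}, ["use_mgda", "False", "save_path", "exp/model"])` would turn MGDA off and set the checkpoint directory, mirroring how the command-line pairs above override the YAML defaults.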
1. I have read the configuration file. Multi-dataset training uses one GPU per dataset, but I don't have enough hardware resources; I have at most four GPUs. Can I still train?
2. The purpose of using MGDA is unclear to me.
3. Does save_path refer to the path where the weights are saved after training?
4. Does auto_resume refer to resuming from a training checkpoint, or to the mseg-3m.pth provided by the author?
Hi @luocmin , please find answers to your questions below:
save_path is the directory where the model checkpoints and results will be saved; see here. auto_resume is a
config parameter that allows one to continue training if a run is interrupted by a scheduler compute-time limit or a hardware error. You could also use it to fine-tune a model.

Like the author, I processed the dataset according to the author's method; I haven't started training yet. I consulted my advisor today: the purpose of studying the author's code is to learn the roadside environment, such as railings, buildings, etc. As for lane recognition, he said that because the author's weights are too large and thus inappropriate, I need to train a smaller model.
1. Which resolution would you like to train at? (480p, 720p, or 1080p) -- 480p
2. Which datasets would you like to train on? -- all of relabeled MSeg
3. In which taxonomy (output space) would you like the model to make predictions? -- I don't quite understand; there are several kinds of taxonomy, so I don't know how to choose.
About the txt files in mseg-api/mseg/dataset_lists/: does the author provide a script in the code to generate these dataset-path txt files?
What I understand from these lines of code is: 7 cards, with one card per dataset. But I don't have 7 cards, only 4. Can I only use 4 datasets for mixed training?
Where does this function come from?
Thanks for catching this; that was a deprecated name -- it should be ToUniversalLabel, not ToFlatLabel (see here). I've updated the training script to reflect this.
If you only have 4 cards, instead of 7, and you still want to train on all 7 datasets, you will need to re-write some training logic. We make the assumption that a user would have at least 7 cards.
Instead of running each iteration over samples from the 7 datasets (7 dataloaders, one in each process), you could run 1 training iteration with 4 datasets, then another training iteration with the other 3 datasets, etc. Alternatively, you could concat all training images into 1 dataloader, and then shard that across the 4 gpus in DistributedDataParallel. Either way, you will need to re-write some code.
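Either workaround above can be sketched in plain Python. Below, `round_robin_groups` alternates iterations between groups of datasets (4, then the remaining 3), and `concat_and_shard` pools all samples and deals them across 4 workers. This is illustrative only; real training code would use PyTorch DataLoaders (e.g. ConcatDataset, DistributedSampler) with DistributedDataParallel, and the function names here are my own:

```python
# Illustrative sketches of the two workarounds for 4 GPUs / 7 datasets.
# Real code would use torch DataLoaders + DistributedDataParallel.

def round_robin_groups(datasets, group_size):
    """Yield successive groups of datasets (e.g. 4, then the remaining 3),
    so each training iteration only needs `group_size` dataloaders/GPUs."""
    for i in range(0, len(datasets), group_size):
        yield datasets[i:i + group_size]

def concat_and_shard(datasets, num_workers):
    """Pool all samples into one list, then deal them round-robin across
    `num_workers` shards (one shard per GPU)."""
    pooled = [sample for d in datasets for sample in d]
    return [pooled[w::num_workers] for w in range(num_workers)]
```

The first strategy keeps the one-dataset-per-GPU structure but halves the effective batch diversity per step; the second gives every GPU a mixed stream of all datasets at the cost of rewriting the per-dataset dataloader logic.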
Regarding the scripts to generate the txt files in mseg-api/mseg/dataset_lists/
-- these should all be generated already for every MSeg training and test dataset. Are you adding paths for a new different dataset?
@luocmin I may understand your problem: you downloaded the code from the 'master' branch, but the training scripts and configs are in the 'train' branch. Just switch branches and you can train the whole model.
Does this repository provide training code? I only see the test code