Open johndpope opened 1 week ago
Does it need training? There's an error in the training code: https://github.com/Dingpx/EAI/blob/main/train.py#L168
UPDATE - how do I define these? `rank = int(os.environ["RANK"])` fails with:
File "<frozen os>", line 679, in __getitem__
KeyError: 'RANK'
UPDATE - found these
```
export RANK=0
export PORT=12345
export LOCAL_RANK=0
export MASTER_ADDR=127.0.0.1
```
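For anyone hitting the same `KeyError: 'RANK'`, here is a minimal sketch of reading these variables with single-process fallbacks instead of indexing `os.environ` directly. The helper name `read_dist_env` and the default values are hypothetical, and `WORLD_SIZE` is an assumption on my part (it does not appear in the exports above); this is not what the repo's `train.py` does.

```python
import os


def read_dist_env(env=None):
    """Read distributed-training settings, falling back to single-process defaults.

    `env` defaults to os.environ; passing a plain dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    return {
        # env.get(...) avoids the KeyError raised by os.environ["RANK"]
        "rank": int(env.get("RANK", 0)),
        "local_rank": int(env.get("LOCAL_RANK", 0)),
        # Assumption: WORLD_SIZE is not in the thread; 1 means a single process
        "world_size": int(env.get("WORLD_SIZE", 1)),
        "master_addr": env.get("MASTER_ADDR", "127.0.0.1"),
        # The thread exports PORT, so the same name is used here
        "port": int(env.get("PORT", 12345)),
    }


if __name__ == "__main__":
    print(read_dist_env())
```

With nothing exported this yields rank 0 on 127.0.0.1, which matches the values in the exports above.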
UPDATE
I replaced the distributed training with **accelerate**:
https://github.com/johndpope/EAI/blob/main/train2.py
it's training...
![Screenshot from 2024-09-10 05-37-28](https://github.com/user-attachments/assets/2121df0c-bf74-4bdf-902b-5a3898ac379c)
How long does training take, and on how large a GPU cluster?
```shell
usage: train.py [-h] [--device DEVICE] [--grab_data_dict GRAB_DATA_DICT] [--exp EXP] [--ckpt CKPT]
[--model_type MODEL_TYPE] [--max_norm] [--linear_size LINEAR_SIZE] [--num_stage NUM_STAGE]
[--num_body NUM_BODY] [--num_lh NUM_LH] [--num_rh NUM_RH] [--lr LR] [--lr_decay LR_DECAY]
[--lr_gamma LR_GAMMA] [--input_n INPUT_N] [--output_n OUTPUT_N] [--all_n ALL_N] [--actions ACTIONS]
[--epochs EPOCHS] [--dropout DROPOUT] [--train_batch TRAIN_BATCH] [--val_batch VAL_BATCH]
[--test_batch TEST_BATCH] [--job JOB] [--seed SEED] [--local_rank LOCAL_RANK] [--W_pg W_PG]
[--W_p W_P] [--is_load] [--is_debug] [--is_exp] [--sample_rate SAMPLE_RATE] [--is_norm_dct]
[--is_norm] [--is_using_saved_file] [--is_hand_norm] [--is_hand_norm_split] [--is_part]
[--part_type PART_TYPE] [--is_boneloss] [--is_weighted_jointloss] [--is_using_noTpose2]
[--is_using_raw] [--J J]
train.py: error: unrecognized arguments: --local-rank=0
```
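The `unrecognized arguments: --local-rank=0` error looks like a spelling mismatch: the usage text lists `--local_rank`, while the launcher passed the dashed form `--local-rank`. As far as I know, newer PyTorch launchers pass the dashed spelling. A possible workaround, sketched below, is to register both spellings on the same argparse option. `build_parser` is a hypothetical stand-in for the parser in `train.py`, which I have not reproduced here.

```python
import argparse


def build_parser():
    """Hypothetical stand-in for the argument parser in train.py."""
    parser = argparse.ArgumentParser()
    # Accept both spellings; dest keeps the attribute name args.local_rank
    parser.add_argument(
        "--local_rank", "--local-rank",
        dest="local_rank", type=int, default=0,
    )
    return parser


if __name__ == "__main__":
    # Both forms parse to the same attribute
    print(build_parser().parse_args(["--local-rank=0"]).local_rank)
    print(build_parser().parse_args(["--local_rank", "1"]).local_rank)
```

Reading `LOCAL_RANK` from the environment instead of a command-line flag would be another option.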
Sorry, it's been a long time since I last ran this code. It seems that I used 8 A100/V100 to train this project. Regarding the checkpoint, please wait for me a moment as I am busy with my current project, and I will release this checkpoint in a few weeks.
my accelerate code got me by - thanks
I'm interested in taking two body poses (https://github.com/johndpope/EAI/blob/main/pose_vis.py) in the COCO-WholeBody format (https://github.com/johndpope/EAI/blob/main/test.png) and interpolating between them with correct, human-like joint movement.
My reading is that this codebase could be suitable. Did you do any work on this? I found more repos doing gesture / fusion, but I just want a sequence of poses to feed into Stable Diffusion.
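To make the goal concrete, here is the naive baseline I would compare against: plain linear interpolation of 2D keypoints between two poses. This is only a sketch and not what EAI does. It does not preserve bone lengths or joint limits, which is exactly why a learned motion model is interesting. The function name `interpolate_poses` and the list-of-`(x, y)` pose layout are my own assumptions.

```python
def interpolate_poses(pose_a, pose_b, num_frames):
    """Linearly interpolate between two poses given as lists of (x, y) keypoints.

    Returns num_frames poses, where the first equals pose_a and the last equals pose_b.
    """
    if len(pose_a) != len(pose_b):
        raise ValueError("both poses must have the same number of keypoints")
    if num_frames < 2:
        raise ValueError("num_frames must be at least 2")

    frames = []
    for i in range(num_frames):
        t = i / (num_frames - 1)  # 0.0 at the start pose, 1.0 at the end pose
        frame = [
            ((1 - t) * xa + t * xb, (1 - t) * ya + t * yb)
            for (xa, ya), (xb, yb) in zip(pose_a, pose_b)
        ]
        frames.append(frame)
    return frames


if __name__ == "__main__":
    # Toy two-keypoint poses; a COCO-WholeBody pose would have 133 keypoints
    start = [(0.0, 0.0), (2.0, 2.0)]
    end = [(4.0, 0.0), (2.0, 6.0)]
    for frame in interpolate_poses(start, end, 3):
        print(frame)
```

Straight-line keypoint paths can shorten limbs mid-transition, so the in-between frames may not look human.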
Running test.py raises errors.