tobiascz / TeCNO

GNU General Public License v3.0

AttributeError: 'FeatureExtraction' object has no attribute 'log' #11

Open maxboels opened 3 years ago

maxboels commented 3 years ago

:bug: Bug

Hello,

I'm trying to run "python train.py -c modules/cnn/config/config_feature_extract.yml" but get some errors. I worked around a few of them, but this 'self.log()' error seems to come from deep inside the installed "pytorch-lightning==0.8.5" version itself.

def validation_step(self, batch, batch_idx):
        x, y_phase, y_tool = batch
        _, p_phase, p_tool = self.forward(x)
        loss = self.loss_phase_tool(p_phase, p_tool, y_phase, y_tool, self.num_tasks)
        # acc_phase, acc_tool, loss
        if self.num_tasks == 2:
            self.val_acc_tool(p_tool, torch.stack(y_tool, dim=1))
            self.log("val_acc_tool", self.val_acc_tool, on_epoch=True, on_step=False)
            self.val_f1_tool(p_tool, torch.stack(y_tool, dim=1))
            self.log("val_f1_tool", self.val_f1_tool, on_epoch=True, on_step=False)
        print('p_phase tensor with batch_size, num_classes: ', p_phase)
        print('y_phase tensor with batch_size: ', y_phase)
        self.val_acc_phase(p_phase, y_phase)
        self.log("val_acc_phase", self.val_acc_phase, on_epoch=True, on_step=False)
        self.log("val_loss", loss, prog_bar=True, logger=True, on_epoch=True, on_step=False)

Expected behaviour

self.log() should be an attribute of the FeatureExtraction class.

Environment

* CUDA:
        - GPU:
        - available:         False
        - version:           10.2
* Packages:
        - numpy:             1.17.4
        - pyTorch_debug:     False
        - pyTorch_version:   1.9.0+cu102
        - pytorch-lightning: 0.8.5
        - tqdm:              4.62.1
* System:
        - OS:                Linux
        - architecture:
                - 64bit
                - ELF
        - processor:         x86_64
        - python:            3.8.10
        - version:           #488-Microsoft Mon Sep 01 13:43:00 PST 2020
leonmayer commented 1 year ago

I got it to work by installing pytorch-lightning=1.1.8 and changing pl.metrics.Fbeta to pl.metrics.FBeta in feature_extraction.py.
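
For anyone applying the same fix: on pytorch-lightning 1.1.8 the metric class is spelled pl.metrics.FBeta (capital B), while older releases exposed it as pl.metrics.Fbeta. A rough sketch of the renamed call; num_classes=7 and beta=1.0 are placeholders, not the repo's exact arguments:

    import torch
    import pytorch_lightning as pl

    # The old spelling pl.metrics.Fbeta does not exist on 1.1.8; use FBeta.
    # num_classes and beta below are illustrative placeholders only.
    f1_tool = pl.metrics.FBeta(num_classes=7, beta=1.0)

    preds = torch.randint(0, 7, (8,))
    target = torch.randint(0, 7, (8,))
    print(f1_tool(preds, target))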

OhSungChoo commented 11 months ago

@leonmayer I'm running into these errors:

########################ArgParseSummaryEnd########################
Output path: /local_datasets/cholec80/output/231110-174404_FeatureExtraction_Cholec80FeatureExtract_cnn_TwoHeadResNet50Model
test extract enabled. Test will be used to extract the videos (testset = all)
Subsampling(factor: 25) data: 25fps > 1.0fps
train: 2157640 > 86306
val: 535933 > 21438
Subsampling(factor: 25) data: 25fps > 1.0fps
test: 4612530 > 184502
Traceback (most recent call last):
  File "/data/obama0404/repos/TeCNO/train.py", line 136, in <module>
    train(hparams, ModuleClass, ModelClass, DatasetClass, loggers)
  File "/data/obama0404/repos/TeCNO/train.py", line 56, in train
    trainer = Trainer(
  File "/data/obama0404/anaconda3/envs/tecno/lib/python3.9/site-packages/pytorch_lightning/trainer/connectors/env_vars_connector.py", line 41, in overwrite_by_env_vars
    return fn(self, **kwargs)
  File "/data/obama0404/anaconda3/envs/tecno/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 348, in __init__
    self.accelerator_connector.on_trainer_init(
  File "/data/obama0404/anaconda3/envs/tecno/lib/python3.9/site-packages/pytorch_lightning/accelerators/accelerator_connector.py", line 104, in on_trainer_init
    self.trainer.data_parallel_device_ids = device_parser.parse_gpu_ids(self.trainer.gpus)
  File "/data/obama0404/anaconda3/envs/tecno/lib/python3.9/site-packages/pytorch_lightning/utilities/device_parser.py", line 78, in parse_gpu_ids
    gpus = _sanitize_gpu_ids(gpus)
  File "/data/obama0404/anaconda3/envs/tecno/lib/python3.9/site-packages/pytorch_lightning/utilities/device_parser.py", line 139, in _sanitize_gpu_ids
    raise MisconfigurationException(f"""
pytorch_lightning.utilities.exceptions.MisconfigurationException:
    You requested GPUs: [-1]
    But your machine only has: [0]

My relevant package versions are:

python              3.9.18   h955ad1f_0
python-dateutil     2.8.2    pypi_0   pypi
pytorch             2.0.1    py3.9_cuda11.7_cudnn8.5.0_0   pytorch
pytorch-cuda        11.7     h778d358_5   pytorch
pytorch-lightning   1.1.8    pypi_0   pypi

Why does this error occur?

Do you have any solutions or thoughts?

Thanks.

leonmayer commented 11 months ago

Seems like a problem with the GPUs, what does torch.cuda.is_available() output?
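
A quick way to check what PyTorch actually sees (just a generic snippet, nothing TeCNO-specific):

    import torch

    print(torch.cuda.is_available())  # True if a usable CUDA device is visible
    print(torch.cuda.device_count())  # how many GPUs PyTorch can see
    print(torch.version.cuda)         # CUDA version this PyTorch build targets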

OhSungChoo commented 11 months ago

Hi, it's Oh Sung Choo.

Thanks for the reply.

It turned out the GPUs weren't the problem at all; it was the wandb API key login.

At first I had no idea what to do when this error occurred:

    raise UsageError("api_key not configured (no-tty). call " + directive)
wandb.errors.UsageError: api_key not configured (no-tty). call wandb.login(key=[your_api_key])

Then I figured out how to solve it.

I don't know whether the pytorch_lightning version matters, but I solved it by calling wandb.login and wandb.init in train.py.
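
For anyone hitting the same no-tty error, a rough sketch of that workaround; the project name is a placeholder and the API key is assumed to come from the WANDB_API_KEY environment variable rather than being hard-coded:

    import os
    import wandb

    # Log in non-interactively before any wandb logger is created in train.py.
    wandb.login(key=os.environ["WANDB_API_KEY"])
    wandb.init(project="tecno-feature-extraction")  # placeholder project name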

Thanks.

leonmayer commented 11 months ago

That's great to hear! :)