FPS value error when setting include_pose = 'holistic'

areeb-agha commented 1 year ago

Hello. When I run the RWTH Phoenix 2014 T code on Colab with include_pose='holistic', I get the following error:

InvalidArgumentError                      Traceback (most recent call last)
<ipython-input-4-365830cd982d> in <cell line: 4>()
      2 rwth_phoenix2014_t = tfds.load(name='rwth_phoenix2014_t', builder_kwargs=dict(config=config))
      3 
----> 4 for datum in itertools.islice(rwth_phoenix2014_t["train"], 0, 10):
      5   print(datum['gloss'].numpy().decode('utf-8'))
      6   print(datum['text'].numpy().decode('utf-8'))

3 frames
/usr/local/lib/python3.10/dist-packages/tensorflow/python/framework/ops.py in raise_from_not_ok_status(e, name)
   7260 def raise_from_not_ok_status(e, name):
   7261   e.message += (" name: " + name if name is not None else "")
-> 7262   raise core._status_to_exception(e) from None  # pylint: disable=protected-access
   7263 
   7264 

InvalidArgumentError: {{function_node __wrapped__IteratorGetNext_output_types_7_device_/job:localhost/replica:0/task:0/device:CPU:0}} Feature: pose/fps (data type: int64) is required but could not be found.
     [[{{node ParseSingleExample/ParseExample/ParseExampleV2}}]] [Op:IteratorGetNext]

I have also tried setting different fps values, but nothing works. If the fps value is the cause of the error, what should I do?

AmitMY commented 1 year ago

I'm not sure exactly what code you are running, and it seems you are running it in a notebook (not that it is supposed to matter).

How much available storage do you have on your machine?
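One quick way to answer this from a notebook or a script is the standard library's shutil.disk_usage. This is only a minimal sketch: the helper name disk_space_gib is made up for illustration, and the path to check is an assumption (point it at the directory where tensorflow_datasets stores its data, which defaults to ~/tensorflow_datasets).

```python
import shutil


def disk_space_gib(path="."):
    """Return (total, used, free) in GiB for the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    gib = 1024 ** 3
    return usage.total / gib, usage.used / gib, usage.free / gib


# "." is a placeholder; replace it with the dataset directory if it differs.
total, used, free = disk_space_gib(".")
print(f"total={total:.1f} GiB, used={used:.1f} GiB, free={free:.1f} GiB")
```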

Can you just run a python file with the following:

import tensorflow_datasets as tfds
import sign_language_datasets.datasets
from sign_language_datasets.datasets.config import SignDatasetConfig

import itertools

config = SignDatasetConfig(name="holistic-poses", version="3.0.0", include_video=False, include_pose="holistic")
rwth_phoenix2014_t = tfds.load(name='rwth_phoenix2014_t', builder_kwargs=dict(config=config))

for datum in itertools.islice(rwth_phoenix2014_t["train"], 0, 10):
    print(datum['gloss'].numpy().decode('utf-8'))
    print(datum['text'].numpy().decode('utf-8'))
    print(datum['pose']['data'].shape)
    print()

When I run it, I get:

GUT ABEND LIEB ZUSCHAUER
guten abend liebe zuschauer
(47, 1, 543, 3)

BERG DAZU SCHNEE
im bergland fällt zunehmend schnee
(56, 1, 543, 3)

UNWETTER WEHEN
und der wind weht auch noch kräftig aus west bis nordwest
(70, 1, 543, 3)

WIE-AUSSEHEN IN-KOMMEND MONTAG BIS MITTWOCH WIE-IMMER
die aussichten von montag bis mittwoch ändert sich das wetter kaum
(99, 1, 543, 3)

BERG HOCH KOENNEN QUELL WOLKE NAH REGION KOENNEN VERDICHTEN
über dem bergland können sich einzelne quellwolken zeigen in küstennähe gibt es auch mal dichtere wolken
(123, 1, 543, 3)

NACH MITTAG ZONE ENORM REGEN KOENNEN REGEN BERG DANN SCHNEE SCHNEIEN
am nachmittag im süden auch kräftigere schauer mit graupel im höheren bergland schnee
(120, 1, 543, 3)

EINIGE STELLENWEISE AUFLOCKERUNG TEIL NEBEL
in einigen regionen lockert es auf stellenweise bildet sich nebel
(97, 1, 543, 3)

REGION UNGEFAEHR NULL GRAD NORDWEST neg-HABEN neg-HABEN FROST
auch im osten so um die null grad während im nordwesten es frostfrei bleibt
(114, 1, 543, 3)

TIEF KOMMEN SCHNEE KALT REGEN GLATT
tiefdruckgebiete lenken von nordwesten schnee und gefrierenden regen nach deutschland
(124, 1, 543, 3)

TSCHECHIEN KOMMEN NEBEL
auch von tschechien driften ein paar hochnebelfelder ins land
(82, 1, 543, 3)

I will, however, warn you against using this dataset. It is over-cleaned (there is no punctuation, for example), among other problems; see https://arxiv.org/pdf/2211.15464.pdf

areeb-agha commented 1 year ago

Yes, I ran the same code you shared in the Colab notebook (the one linked from the repository). I am using the free version of Google Colab, which has 107 GB of disk space.

AmitMY commented 1 year ago

The code in the Colab notebook does not load holistic poses; it sets include_pose=None. I need to know the exact code that you ran, or you can run the code I attached in my previous comment.

areeb-agha commented 1 year ago

Yes, I set include_pose="holistic". I want to download the keypoints of the videos. Is there any way I can do that with your code?

AmitMY commented 1 year ago

Can you please respond with:

the exact code you are running
what machine you are running it on
the error (if it is no longer the same)

For example, I am running:

import tensorflow_datasets as tfds
import sign_language_datasets.datasets
from sign_language_datasets.datasets.config import SignDatasetConfig

import itertools

config = SignDatasetConfig(name="holistic-poses", version="3.0.0", include_video=False, include_pose="holistic")
rwth_phoenix2014_t = tfds.load(name='rwth_phoenix2014_t', builder_kwargs=dict(config=config))

for datum in itertools.islice(rwth_phoenix2014_t["train"], 0, 10):
    print(datum['gloss'].numpy().decode('utf-8'))
    print(datum['text'].numpy().decode('utf-8'))
    print(datum['pose']['data'].shape)
    print()

I am running it on a CentOS 7 Linux server, and I get no errors.

areeb-agha commented 1 year ago

Yes, after running the following code I no longer get the error. I don't know why it wasn't working before.

import tensorflow_datasets as tfds
import sign_language_datasets.datasets
from sign_language_datasets.datasets.config import SignDatasetConfig

import itertools

config = SignDatasetConfig(name="holistic-poses", version="3.0.0", include_video=False, include_pose="holistic")
rwth_phoenix2014_t = tfds.load(name='rwth_phoenix2014_t', builder_kwargs=dict(config=config))

for datum in itertools.islice(rwth_phoenix2014_t["train"], 0, 10):
    print(datum['gloss'].numpy().decode('utf-8'))
    print(datum['text'].numpy().decode('utf-8'))
    print(datum['pose']['data'].shape)
    print()

I am running this on Colab. What I actually want is to get the keypoints from the videos in this dataset. Do you have any idea how I can do that?

AmitMY commented 1 year ago

I'm glad it is working for you.

To access the keypoints, you can access datum['pose']['data'].

My recommendation is to open it as a Pose object:

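from pose_format import Pose, PoseHeader
from pose_format.numpy import NumPyPoseBody
from pose_format.utils.reader import BufferReader

from sign_language_datasets.datasets.rwth_phoenix2014_t.rwth_phoenix2014_t import _POSE_HEADERS

# Read the holistic pose header that ships with the dataset package
with open(_POSE_HEADERS["holistic"], "rb") as buffer:
  pose_header = PoseHeader.read(BufferReader(buffer.read()))

# The pose fields are nested under datum["pose"]
fps = int(datum["pose"]["fps"].numpy())
pose_body = NumPyPoseBody(fps, datum["pose"]["data"].numpy(), datum["pose"]["conf"].numpy())

pose = Pose(pose_header, pose_body)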

areeb-agha commented 1 year ago

Thanks for providing the keypoint extraction code. One more problem: when I run this code in my local Jupyter Notebook,

import tensorflow_datasets as tfds
import sign_language_datasets.datasets
from sign_language_datasets.datasets.config import SignDatasetConfig

import itertools

it gives the following error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_18052/921712294.py in <module>
      1 import tensorflow_datasets as tfds
----> 2 import sign_language_datasets.datasets
      3 from sign_language_datasets.datasets.config import SignDatasetConfig
      4 
      5 import itertools

~\Downloads\datasets\sign_language_datasets\datasets\__init__.py in <module>
----> 1 from .aslg_pc12 import AslgPc12
      2 from .asl_lex import AslLex
      3 from .autsl import AUTSL
      4 from .chicagofswild import ChicagoFSWild
      5 from .config import SignDatasetConfig

~\Downloads\datasets\sign_language_datasets\datasets\aslg_pc12\__init__.py in <module>
      1 """aslg_pc12 dataset."""
      2 
----> 3 from .aslg_pc12 import AslgPc12

~\Downloads\datasets\sign_language_datasets\datasets\aslg_pc12\aslg_pc12.py in <module>
     24 
     25 
---> 26 class AslgPc12(tfds.core.GeneratorBasedBuilder):
     27     """DatasetBuilder for aslg_pc12 dataset."""
     28 

AttributeError: module 'tensorflow_datasets' has no attribute 'core'

I think tensorflow_datasets does have the attribute 'core'. The code runs perfectly well on Google Colab, with no errors. Do you know what the reason is?

AmitMY commented 1 year ago

Please check which version of tensorflow_datasets is installed on Colab and locally:

import tensorflow_datasets
print(tensorflow_datasets.__version__)
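If the import itself is misbehaving, it may also help to see which files Python would load for that name, since a stray local folder with the same name can shadow the installed package. That is only a guess at a possible cause; nothing in this thread confirms it. A minimal sketch using only the standard library follows. The helper name describe_module is made up for illustration, and "tensorflow-datasets" is assumed to be the distribution name used by pip.

```python
import importlib.util
from importlib import metadata


def describe_module(module_name, dist_name):
    """Report where `module_name` resolves from and the installed version of `dist_name`."""
    spec = importlib.util.find_spec(module_name)
    origin = spec.origin if spec is not None else None
    # For packages this lists the directories searched for submodules;
    # an unexpected directory here hints at a shadowing folder.
    locations = list(spec.submodule_search_locations or []) if spec is not None else []
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None
    return {"origin": origin, "search_locations": locations, "version": version}


print(describe_module("tensorflow_datasets", "tensorflow-datasets"))
```

Running this both on Colab and locally and comparing the two dictionaries shows whether the versions differ and whether the local import comes from an unexpected directory.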

areeb-agha commented 1 year ago

For accessing the keypoints, I just ran this code on Colab:


from sign_language_datasets.datasets.rwth_phoenix2014_t.rwth_phoenix2014_t import _POSE_HEADERS
from pose_format import Pose, PoseHeader
from pose_format.numpy import NumPyPoseBody
from pose_format.utils.reader import BufferReader
config = SignDatasetConfig(name="holistic-poses", version="3.0.0", include_video=False, include_pose="holistic")
rwth_phoenix2014_t = tfds.load(name='rwth_phoenix2014_t', builder_kwargs=dict(config=config))

import pickle
file_path = "pose_data.pkl"

print("Pose saved to:", file_path)

with open(_POSE_HEADERS["holistic"], "rb") as buffer:
  pose_header = PoseHeader.read(BufferReader(buffer.read()))
for datum in itertools.islice(rwth_phoenix2014_t["train"], None):
  fps = int(datum['pose']["fps"].numpy())
  pose_body = NumPyPoseBody(fps, datum['pose']["data"].numpy(), datum['pose']["conf"].numpy())

  pose = Pose(pose_header, pose_body)
  with open(file_path, "wb") as file:
    pickle.dump(pose, file)

print("Pose saved to:", file_path)

It generated the pickle file. On running:

import pandas as pd
df=pd.read_pickle('/content/pose_data.pkl')
dir(df)

I get the output:

['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattr__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 'bbox',
 'body',
 'focus',
 'frame_dropout_normal',
 'frame_dropout_uniform',
 'get_components',
 'header',
 'normalize',
 'normalize_distribution',
 'pass_through_methods',
 'read',
 'unnormalize_distribution',
 'write']

Where are the keypoints located in this list?

AmitMY commented 1 year ago

Hi! Sorry for the late response.

The correct way to save poses as files is:

with open('name.pose', 'wb') as f:
  pose.write(f)

That makes a .pose file that you can drag to https://sign.mt to watch, or open via Python or JavaScript.
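Note that the loop in the previous comment reopens the same path in "wb" mode on every iteration, so only the last pose survives. A minimal sketch of writing one file per example instead is below. FakePose is a stand-in that only mimics the write(f) call shown above, and save_all and the numbered file names are made up for illustration; with real data you would pass the Pose objects built in the loop.

```python
import os
import tempfile


class FakePose:
    """Stand-in for a pose object; it only mimics the write(file) call."""

    def __init__(self, payload: bytes):
        self.payload = payload

    def write(self, f):
        f.write(self.payload)


def save_all(poses, out_dir):
    """Write each pose to its own numbered .pose file and return the paths."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for index, pose in enumerate(poses):
        path = os.path.join(out_dir, f"{index:05d}.pose")
        with open(path, "wb") as f:
            pose.write(f)
        paths.append(path)
    return paths


out_dir = tempfile.mkdtemp()
paths = save_all([FakePose(b"a"), FakePose(b"bb"), FakePose(b"ccc")], out_dir)
print(len(paths), "files written to", out_dir)
```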

To answer your question: in your pickled object, you can access body.data to get the matrix of keypoints.
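The shapes printed earlier, such as (47, 1, 543, 3), read as (frames, people, points, dimensions). The 543 points match the MediaPipe Holistic landmark counts: 33 body, 468 face, and 21 per hand. The sketch below computes the index range of each group. The component names and their order are an assumption here; pose.header.components is the authoritative source for both.

```python
# Landmark counts per MediaPipe Holistic component.
# ASSUMPTION: the names and the order below; verify against pose.header.components.
COMPONENTS = [
    ("POSE_LANDMARKS", 33),
    ("FACE_LANDMARKS", 468),
    ("LEFT_HAND_LANDMARKS", 21),
    ("RIGHT_HAND_LANDMARKS", 21),
]


def component_slices(components):
    """Map each component name to its slice along the points axis."""
    slices = {}
    start = 0
    for name, count in components:
        slices[name] = slice(start, start + count)
        start += count
    return slices, start


slices, total_points = component_slices(COMPONENTS)
print(total_points)  # 543
```

Under that assumed order, body.data[:, 0, slices["LEFT_HAND_LANDMARKS"], :] would select the left-hand keypoints of the first person in every frame.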

areeb-agha commented 1 year ago

Thank you so much, Mr. Amit. I have finally been able to access the keypoints.

sign-language-processing / datasets

FPS value error when setting include_pose = 'holistic' #41