microsoft / VideoX

VideoX: a collection of video cross-modal models

X-CLIP's dataset class error #86

Closed xqfJohn closed 1 year ago

xqfJohn commented 1 year ago

Hi, I ran into an error when I tried to run the X-CLIP code. The failing line is `data[self.meta_name] = DC(meta, cpu_only=True)`.

The function `DC` is not defined anywhere and is not documented.

Could you please explain it? Thank you so much~

```python
@PIPELINES.register_module()
class Collect:
    """Collect data from the loader relevant to the specific task.

    This keeps the items in ``keys`` as they are, and collects items in
    ``meta_keys`` into a meta item called ``meta_name``. This is usually
    the last stage of the data loader pipeline.
    For example, when keys='imgs', meta_keys=('filename', 'label',
    'original_shape'), meta_name='img_metas', the results will be a dict with
    keys 'imgs' and 'img_metas', where 'img_metas' is a DataContainer of
    another dict with keys 'filename', 'label', 'original_shape'.

    Args:
        keys (Sequence[str]): Required keys to be collected.
        meta_name (str): The name of the key that contains meta information.
            This key is always populated. Default: "img_metas".
        meta_keys (Sequence[str]): Keys that are collected under meta_name.
            The contents of the ``meta_name`` dictionary depend on
            ``meta_keys``.
            By default this includes:

            - "filename": path to the image file
            - "label": label of the image file
            - "original_shape": original shape of the image as a tuple
                (h, w, c)
            - "img_shape": shape of the image input to the network as a tuple
                (h, w, c).  Note that images may be zero padded on the
                bottom/right, if the batch tensor is larger than this shape.
            - "pad_shape": image shape after padding
            - "flip_direction": a str in ("horizontal", "vertical") to
                indicate if the image is flipped horizontally or vertically.
            - "img_norm_cfg": a dict of normalization information:
                - mean - per channel mean subtraction
                - std - per channel std divisor
                - to_rgb - bool indicating if bgr was converted to rgb
        nested (bool): If set as True, will apply data[x] = [data[x]] to all
            items in data. The arg is added for compatibility. Default: False.
    """

    def __init__(self,
                 keys,
                 meta_keys=('filename', 'label', 'original_shape', 'img_shape',
                            'pad_shape', 'flip_direction', 'img_norm_cfg'),
                 meta_name='img_metas',
                 nested=False):
        self.keys = keys
        self.meta_keys = meta_keys
        self.meta_name = meta_name
        self.nested = nested

    def __call__(self, results):
        """Performs the Collect formatting.

        Args:
            results (dict): The resulting dict to be modified and passed
                to the next transform in pipeline.
        """
        data = {}
        for key in self.keys:
            data[key] = results[key]

        if len(self.meta_keys) != 0:
            meta = {}
            for key in self.meta_keys:
                meta[key] = results[key]
            data[self.meta_name] = DC(meta, cpu_only=True)
        if self.nested:
            for k in data:
                data[k] = [data[k]]

        return data
```

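
For context, here is a minimal sketch of what `Collect` does. It assumes that `DC` is the usual MMAction2-style alias for `mmcv.parallel.DataContainer` (available in mmcv 1.x), which the docstring's mention of a DataContainer suggests but the X-CLIP code shown here does not confirm. The fallback `DC` class, the function name `collect`, and the sample values are hypothetical and exist only so the sketch runs without mmcv.

```python
# Assumption: DC is an alias for mmcv's DataContainer, as in MMAction2-style code.
try:
    from mmcv.parallel import DataContainer as DC  # present in mmcv 1.x
except ImportError:
    class DC:
        """Hypothetical minimal stand-in so this sketch runs without mmcv."""

        def __init__(self, data, cpu_only=False):
            self.data = data
            self.cpu_only = cpu_only


def collect(results, keys, meta_keys, meta_name="img_metas", nested=False):
    """Function form of Collect.__call__ from the snippet above."""
    # Keep the requested keys unchanged.
    data = {key: results[key] for key in keys}

    # Gather the meta keys into one dict wrapped in a DataContainer.
    if len(meta_keys) != 0:
        meta = {key: results[key] for key in meta_keys}
        data[meta_name] = DC(meta, cpu_only=True)

    # Optionally wrap every value in a single-element list.
    if nested:
        for k in data:
            data[k] = [data[k]]
    return data


# Hypothetical sample produced by earlier pipeline stages.
results = {
    "imgs": [[0.0, 1.0]],
    "label": 3,
    "filename": "video_0001.mp4",
    "original_shape": (240, 320),
}
out = collect(results, keys=["imgs", "label"],
              meta_keys=("filename", "original_shape"))
print(sorted(out))
print(out["img_metas"].data["filename"])
```

If that assumption holds, the error may come from a missing `from mmcv.parallel import DataContainer as DC` line in the dataset file, or from an installed mmcv version that does not provide `mmcv.parallel`.
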
nbl97 commented 1 year ago

Thanks for your interest, and sorry for the late reply. Did you organize the datasets in the format shown in the README?

xqfJohn commented 1 year ago

Yes, I prepared all the data following the README file.