HumanSignal / label-studio

Label Studio is a multi-type data labeling and annotation tool with standardized output format
https://labelstud.io
Apache License 2.0
Missing data display in time series #1752

Open kennyxue opened 2 years ago

kennyxue commented 2 years ago

Describe the bug Data plot error with the Y axis scale and zoom for time series.

To Reproduce Steps to reproduce the behavior:

  1. Set up a time series labeling project with the attached data and config file.
  2. Zoom in and out in the data display panel.
  3. See the error: the data line doesn't display.

Expected behavior The data plot error for time series should be fixed.

Screenshots failure

Environment (please complete the following information):

Additional context attached data and config file config.xml.csv

2020-12-1comb_history.csv

Fourkane commented 2 years ago

We have the same issue with files containing more than 20k data points.

https://user-images.githubusercontent.com/55290094/142640603-ace1bda5-7719-4dfa-bba7-c8d8b48629ec.mp4

jardinetsouffleton commented 2 years ago

Been having the same problem.

swssl commented 2 years ago

I'm facing the exact same problem on Ubuntu (docker) and Windows 10 (pip installation). Unfortunately, the time series editor is the key feature I am really looking forward to trying and integrating into my workflow.

jeroenboeye commented 2 years ago

Same thing here, Windows 10, Chrome browser.

michaelhoarau commented 2 years ago

Hi, same thing here: my time series are not displayed when the number of data points to be displayed is apparently too high. Attached are a view where the time series are visible (slider zoom at the bottom set to a little over 3 weeks) and another where the time series visualization gets buggy (when I try to visualize more than 1 month of data). Note that the default zoom is set too large.

view1_normal view2_bug

makseq commented 2 years ago

@michaelhoarau Thank you for your report. Is it possible to share the data and the labeling config to reproduce this issue?

berrnine commented 2 years ago

@makseq

I have a similar issue. I can drag out the x axis and see a complete view of my data, but the label cursor will not appear and the y axis will bug out with time series data containing H:M:S format. It works with just dates though.

Works: 2022/2/7
Doesn't work: 2022/2/7 12:23:42

Buggy Y axis: (five screenshots attached)

dataset: total.csv

config: "

makseq commented 2 years ago

Thank you for the provided data & config, it helps us a lot!

bmartel commented 2 years ago

I believe this might be a sort-order problem with the time column of the imported data. If I take either of the datasets and sort it by the time column (ascending or descending) before uploading, there are no errors and it works entirely as expected. I am going to look further into other preprocessing steps to see if we can handle this automatically without impacting other workflows. In the meantime, can you ensure the imported data is sorted by the time column?
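A minimal sketch of that workaround, assuming the CSV is small enough to load into memory. The function name `sort_csv_by_time` is hypothetical, and the `time_y` column name and the `%Y/%m/%d %H:%M:%S` format are assumptions taken from the examples in this thread; adjust them to match your file.

```python
import csv
import io
from datetime import datetime


def sort_csv_by_time(text, time_column, time_format, sep=","):
    """Return CSV text with the data rows sorted ascending by time_column."""
    reader = csv.DictReader(io.StringIO(text), delimiter=sep)
    rows = list(reader)
    # Parse each timestamp so the sort is chronological, not alphabetical.
    rows.sort(key=lambda row: datetime.strptime(row[time_column], time_format))

    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=reader.fieldnames, delimiter=sep, lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Illustrative rows only (not taken from the attached datasets).
sample = (
    "time_y,value\n"
    "2022/2/7 12:23:42,3\n"
    "2022/2/6 08:00:00,1\n"
    "2022/2/7 09:15:00,2\n"
)
print(sort_csv_by_time(sample, "time_y", "%Y/%m/%d %H:%M:%S"))
```

For a file on disk, read it with `open(path, newline="")`, pass the text in, and write the result to a new file before importing it.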

bmartel commented 2 years ago

As for datasets that are too large causing an issue, I don't know that this is necessarily the same issue. I'll try to reproduce that, but as a first check I would just ensure the datasets are ordered correctly by time: the code currently has a problem handling datasets that are ordered by a column other than the timeColumn.
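One way to run that first check before importing, as a rough sketch: the helper name `find_out_of_order_rows` is hypothetical, it only tests for ascending order, and the column name and time format in the sample are assumptions.

```python
import csv
import io
from datetime import datetime


def find_out_of_order_rows(text, time_column, time_format, sep=","):
    """Return 1-based data-row numbers whose timestamp precedes the previous row's."""
    reader = csv.DictReader(io.StringIO(text), delimiter=sep)
    bad_rows = []
    previous = None
    for row_number, row in enumerate(reader, start=1):
        current = datetime.strptime(row[time_column], time_format)
        if previous is not None and current < previous:
            bad_rows.append(row_number)
        previous = current
    return bad_rows


# Illustrative rows only: the third data row goes back in time.
sample = (
    "time,value\n"
    "2022-02-07 12:00:00,1\n"
    "2022-02-07 12:20:00,2\n"
    "2022-02-07 12:10:00,3\n"
)
print(find_out_of_order_rows(sample, "time", "%Y-%m-%d %H:%M:%S"))
```

An empty list means the time column is already in ascending order.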

Output based on original csv total.csv

Screen Shot 2022-06-08 at 3 52 50 PM

Output based on sorted csv time_y ascending Untitled spreadsheet - total.csv

Screen Shot 2022-06-08 at 3 52 22 PM

berrnine commented 2 years ago

> I believe this might be a sort order problem with the data imported in the time column. If I take either of the datasets, and order it prior to uploading the data by the time column in either asc or desc, there are no errors and it works entirely as expected. I am going to take a look further into other precondition processes to see if we can handle this automatically without impacting other workflows. In the meantime, can you ensure the imported data is sorted in order of the time column?

Thank you, this worked!

sopeeweje commented 2 years ago

I'm having the same problem even with my data sorted by the time column. Any thoughts?

Data: data.csv

Config:

<View>
    <!-- No region selected section -->
    <View visibleWhen="no-region-selected" style="height:120px">

        <!-- Control tag for region labels -->
        <TimeSeriesLabels name="label" toName="ts">
            <Label value="Region" background="#5b5"/>
        </TimeSeriesLabels>
    </View>

    <!-- Region selected section with choices and rating -->
    <View visibleWhen="region-selected" style="height:120px">
        <!-- Per region Choices  -->
        <Choices name="choices" toName="ts" showInline="true" required="true" perRegion="true">
            <Choice value="Good"/>

        </Choices>
    </View>

    <!-- Object tag for time series data source -->
    <TimeSeries name="ts" valueType="url" value="$csv" sep="," timeColumn="time" timeFormat="%Y-%m-%d %H:%M:%S.%f">
        <Channel column="thing1" strokeColor="#17b" legend="thing1"/>
        <Channel column="thing2" strokeColor="#17b" legend="thing2"/>
        <Channel column="thing3" strokeColor="#17b" legend="thing3"/>
        <Channel column="thing4" strokeColor="#17b" legend="thing4"/>
        <Channel column="thing5" strokeColor="#17b" legend="thing5"/>
        <Channel column="thing6" strokeColor="#17b" legend="thing6"/>
        <Channel column="thing7" strokeColor="#17b" legend="thing7"/>
        <Channel column="thing8" strokeColor="#17b" legend="thing8"/>
        <Channel column="thing9" strokeColor="#17b" legend="thing9"/>
    </TimeSeries>
</View>

swssl commented 2 years ago

Hi, I tried different things concerning this issue and want to share my findings. Used software:

Used Dataset: https://archive.ics.uci.edu/ml/datasets/Appliances+energy+prediction

Used labeling interface:

<View>
    <TimeSeries name="ts" valueType="url" value="$csv"
                sep=";"
                timeColumn="date"
                timeFormat="%Y-%m-%d %H:%M:%S"
                timeDisplayFormat="%Y-%m-%d %H:%M:%S">
      <Channel column="T1"
                 units="unit"
                 displayFormat=",.2f"
                 strokeColor="#ff1122"
                 legend="T1"/>
      <Channel column="Visibility"
                 units="unit"
                 displayFormat=",.2f"
                 strokeColor="#1f77b4"
                 legend="Visibility"/>
      <Channel column="rv1"
                 units="unit"
                 displayFormat=",.2f"
                 strokeColor="#1f77b4"
                 legend="random"/>  
    </TimeSeries>
    <TimeSeriesLabels name="label" toName="ts">
        <Label value="Region" background="red" />
    </TimeSeriesLabels>
    <Choices name="region_type" toName="ts"
          perRegion="true" required="true">
        <Choice value="Outlier"/>
        <Choice value="Anomaly"/>
    </Choices>
</View>

What I found:

  1. When importing the time series data as-is (after deleting the " characters and replacing , with ;), all three channels (T1, Visibility, rv1) are visible for a time period of about 1970 lines of data points (10-minute period each, see the dataset description). When zooming out, the T1 column behaves as described above: it seems like the scale of the graph shifts, so that just the peaks are visible (Screenshot_20220713_134205). When this happens and I shift the time window to touch the right end of the data, suddenly everything is hidden. This behaviour doesn't depend on the order of the channels (T1 is always affected), the colour settings, or the number of channels (I tried 2 and 3).

  2. When copying the data from the working "Visibility" column to the "T1" column, everything works fine, independent of order, colour settings, etc. I didn't touch (or sort) the "date" column, so is it possible that I overlooked something in the documentation? Or is there another rule that the data has to fulfill?

- Thanks
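The preprocessing step mentioned in finding 1 (dropping the " characters and switching the separator from , to ;) can be scripted. This is only a sketch: the helper name `convert_delimiter` is hypothetical, and the sample row is made up for illustration rather than copied from the dataset.

```python
import csv
import io


def convert_delimiter(text, old_sep=",", new_sep=";"):
    """Re-write CSV text with a new separator, quoting fields only when required."""
    reader = csv.reader(io.StringIO(text), delimiter=old_sep)
    out = io.StringIO()
    writer = csv.writer(
        out, delimiter=new_sep, quoting=csv.QUOTE_MINIMAL, lineterminator="\n"
    )
    # The reader strips the surrounding quotes; the writer re-adds them
    # only for fields that contain the new separator or a quote character.
    writer.writerows(reader)
    return out.getvalue()


# Illustrative row only.
sample = '"date","T1","Visibility"\n"2016-01-11 17:00:00","19.89","63.0"\n'
print(convert_delimiter(sample))
```

The result matches the `sep=";"` setting used in the labeling interface above.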

pidefrem commented 1 year ago

Hello, I'm facing this problem: when I zoom or select a label, the time series often goes out of scale. Should I create another issue for that, or is it related?

See the third time series on the first image and the second one on the second plot (starting from the top): the top part is missing.

image

image

harrymander commented 1 year ago

Also seeing this with large segments: everything shows fine for ~38,000 data points, but when I zoom out further than that, the time series are clipped along the top:

image

image

(You can also see in the latter image that the highlight on the summary bar below the plots is wrong: it should cover most of the time series.)

Video:

Screencast from 20-10-22 08:59:36.webm

Datasets are sorted in ascending order by time (left-most column of CSV). Label config below:

<View>
  <Header value="Record version" />
  <Text name="record-version" valueType="text" value="$record_version" />

  <Header value="Original filename" />
  <Text name="original-filename" valueType="text" value="$original_filename" />

  <Header value="Tidal breathing H2S" />

  <Header name="flow-rate-label" value="(Negative flow rate = inhale)" size="6" />

  <TimeSeriesLabels name="tidal-breathing-labels" toName="tidal-breathing">
    <Label value="h2s-baseline"/>
    <Label value="tidal-breathing"/>
  </TimeSeriesLabels>

  <TimeSeries
    name="tidal-breathing"
    valueType="url"
    value="$csv"
    timeColumn="millis"
    overviewChannels="flow,h2s"
  >
    <Channel column="flow" units="L/m" displayFormat=",.1f" legend="Flow rate"></Channel>
    <Channel column="h2s" units="ppm" displayFormat=",.1f" legend="H2S concentration"></Channel>
  </TimeSeries>

  <Header value="Data errors" size="5" />
  <Choices name="errors" toName="tidal-breathing" choice="multiple">
    <Choice value="Flow rate"/>
    <Choice value="H2S"/>
  </Choices>

  <TextArea name="notes" toName="tidal-breathing" placeholder="Notes" />
</View>

bmartel commented 1 year ago

There are currently a couple of different issues here; I have a fix for the main one under review. We will hopefully have this resolved soon, and I will update this ticket as soon as it's merged.

harrymander commented 1 year ago

Sorry to be that guy, but is there an update on this? 1.7.0 seemed to improve it somewhat, but I am still seeing issues, especially when there is a large range in the y-axis. The issue seems to be related to weird behaviour of the overview box: the box gets bigger when zooming in, disappears, etc.

https://user-images.githubusercontent.com/41089556/215891506-01107263-b3da-4521-8d1e-6ee16d02c8a5.mp4

bmartel commented 1 year ago

@harrymander Can you provide the exact config and a sample dataset that were used in the screen recording? This should be fixed in 1.7.0, but judging from this recording, it looks like it didn't quite fix all of the issues.

hogepodge commented 1 year ago

@harrymander no need to be sorry, we appreciate your patience and persistence. We're working on tightening up the feedback loop from community issues, and are grateful that you're helping make this experience better for everyone.

khadijakhaldi commented 11 months ago

@bmartel I have this config, and the dates look like 12/8/2022 4:45:00 AM. I can't read the data, and I think it's because of the date format. Any ideas, please?

[test.csv](https://github.com/heartexlabs/label-studio/files/12131532/test.csv) ![Screenshot 2023-07-20 161748](https://github.com/heartexlabs/label-studio/assets/25855062/8b718f16-8128-47af-9c1f-48555badc195)
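One possible workaround, sketched here under assumptions rather than confirmed as the cause, is to rewrite the timestamps into the `%Y-%m-%d %H:%M:%S` layout used by a config earlier in this thread before importing. The helper name `to_iso_timestamp` is hypothetical, and the sketch assumes the dates are month/day/year with a 12-hour clock; swap `%m` and `%d` if the file is day-first.

```python
from datetime import datetime


def to_iso_timestamp(value, source_format="%m/%d/%Y %I:%M:%S %p"):
    """Convert a 12-hour, slash-separated timestamp to 'YYYY-MM-DD HH:MM:SS'."""
    parsed = datetime.strptime(value, source_format)
    return parsed.strftime("%Y-%m-%d %H:%M:%S")


print(to_iso_timestamp("12/8/2022 4:45:00 AM"))
```

After converting the time column this way, the TimeSeries tag would use timeFormat="%Y-%m-%d %H:%M:%S".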