rwth-i6 / i6_core

Sisyphus recipies for ASR
Mozilla Public License 2.0
16 stars 23 forks source link

Add `PlotAlignmentsJob` #528

Closed Icemole closed 2 months ago

Icemole commented 3 months ago

The clip isn't beneficial when one wants to see the plot as a whole.

Edit: the description above is obsolete. The new PR description is: added job to visualize the alignment scores.

michelwi commented 3 months ago

But without the clip, the plots can be illegible.. e.g. if all relevant data is between 0 and 10 and there is one outlier at 10^5, then one would not see anything in the resulting plot.

Icemole commented 3 months ago

... which is what just happened to me. Great. I'm not sure what to do here. Clip by 3 * stddev? 99th percentile? Allow user customization of the cut? The point is that I would like to see most of the plot, but the 90th percentile isn't enough for my needs.

Icemole commented 3 months ago

Actually clipping by median + 3 * stddev would be better than clipping by mean + 3 * stddev.

Icemole commented 3 months ago

I'm thinking of adding a PlotAlignmentsJob that allows for more customizability than a task in the AlignmentJob.

michelwi commented 3 months ago

I don't think there is a general rule that we can come up with that would work with all data and all peoples preferences.

Allow user customization of the cut?

maybe this.

Icemole commented 3 months ago

Should we remove this parameter from the hashes? I don't want to modify the parameter and then having to rerun the whole job. Is there any way to only have to run a given task after a parameter change?

michelwi commented 3 months ago

Should we remove this parameter from the hashes?

it changes the outputs of the job.. 🤷

I don't want to modify the parameter and then having to rerun the whole job.

yes, but you have to, if you want the parameter to take effect

Is there any way to only have to run a given task after a parameter change?

I don't think so, the hash is on job-level

Icemole commented 3 months ago

I wouldn't like having to align everything every time I want to slightly modify the resulting plot. I think the best course of action involves adding a new job PlotAlignmentsJob which would be highly customizable. I'll try to do this.

sarahberanek commented 3 months ago

In my opinion one should always look at all data - probably in log - to see the outliers and then zoom into to the region of interest. So I would never think that one plot is enough.

michelwi commented 3 months ago

Icemole requested a review from michelwi 3 days ago

From my last review there is still open to revert the two changes in the existing code. This is probably also why the black step in the pipeline fails. Otherwise lgtm.

Icemole commented 3 months ago

Icemole requested a review from michelwi https://github.com/rwth-i6/i6_core/pull/528#event-13742469432

From my last review there is still open to revert the two changes in the existing code. This is probably also why the black step in the pipeline fails. Otherwise lgtm.

I changed it because I saw it was a line 150 characters long, which is over the 120 character threshold. Let me know if you still want me to revert it to its original state.