spokenlanguage / platalea

Library for training visually-grounded models of spoken language understanding.
Apache License 2.0

75 howto100m preprocessing #86

Status: Closed (cwmeijer closed this 3 years ago)

cwmeijer commented 3 years ago

This will add howto100m preprocessing to platalea.

codecov-commenter commented 3 years ago

Codecov Report

Merging #86 (e404f3c) into master (90cc6de) will decrease coverage by 0.56%. The diff coverage is 95.91%.

@@            Coverage Diff             @@
##           master      #86      +/-   ##
==========================================
- Coverage   75.37%   74.80%   -0.57%     
==========================================
  Files          32       38       +6     
  Lines        2160     2532     +372     
==========================================
+ Hits         1628     1894     +266     
- Misses        532      638     +106     
| Impacted Files | Coverage Δ |
|---|---|
| platalea/utils/preprocessing.py | 47.50% <93.61%> (ø) |
| tests/platalea/utils/test_preprocessing.py | 97.91% <97.91%> (ø) |
| platalea/experiments/config.py | 93.54% <100.00%> (+0.32%) |
| platalea/audio/preproc.py | 100.00% <0.00%> (ø) |
| platalea/audio/filters.py | 100.00% <0.00%> (ø) |
| platalea/audio/features.py | 100.00% <0.00%> (ø) |
| platalea/audio/melfreq.py | 100.00% <0.00%> (ø) |

egpbos commented 3 years ago

Is this ready to merge?

cwmeijer commented 3 years ago

Thanks for reviewing; I'll take your comments into account. I've put this PR back into draft mode, though. I'm working on the data processing at the same time (as opposed to this PR, which is about PREprocessing), and many of the problems I encounter in the processing are best solved by changing the preprocessing. Once it all makes sense and is functional, I'll ask you (or one of you) to have another look at the final changes.

cwmeijer commented 3 years ago

I fixed the issues raised by Patrick some time ago. I also changed many other things. For instance, an index file is now written that contains metadata: start/end locations in the memmap and paths to the video features. I think it's ready to merge now, so I requested another review.
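
For readers unfamiliar with the index-plus-memmap layout mentioned above, here is a minimal sketch of the general idea. It is not the code from this PR: the function names (`write_features`, `read_item`), the JSON index format, and the key names (`start`, `end`, `video_feat_path`) are all hypothetical, chosen only for illustration. It assumes variable-length feature arrays are packed back to back into one `numpy.memmap`, with an index recording where each item starts and ends.

```python
import json
import os
import tempfile

import numpy as np


def write_features(feature_arrays, video_feat_paths, memmap_path, index_path,
                   dtype="float32"):
    """Pack (frames, dim) arrays into one memmap and write a JSON index.

    Hypothetical layout: the index stores the memmap dtype and shape plus,
    per item, its start/end row and the path to its video features.
    """
    n_frames = sum(a.shape[0] for a in feature_arrays)
    dim = feature_arrays[0].shape[1]
    mm = np.memmap(memmap_path, dtype=dtype, mode="w+", shape=(n_frames, dim))

    items = []
    start = 0
    for arr, video_path in zip(feature_arrays, video_feat_paths):
        end = start + arr.shape[0]
        mm[start:end] = arr  # copy this item's frames into its slot
        items.append({"start": start, "end": end,
                      "video_feat_path": video_path})
        start = end

    mm.flush()  # make sure the data reaches the file on disk
    del mm

    with open(index_path, "w") as f:
        json.dump({"dtype": dtype, "shape": [n_frames, dim], "items": items}, f)
    return items


def read_item(memmap_path, index_path, i):
    """Return (features, video_feat_path) for item i using the index."""
    with open(index_path) as f:
        meta = json.load(f)
    mm = np.memmap(memmap_path, dtype=meta["dtype"], mode="r",
                   shape=tuple(meta["shape"]))
    item = meta["items"][i]
    # Copy the slice so the result does not depend on the open memmap.
    return np.array(mm[item["start"]:item["end"]]), item["video_feat_path"]


if __name__ == "__main__":
    tmp = tempfile.mkdtemp()
    a = np.ones((3, 4), dtype="float32")
    b = np.full((2, 4), 2.0, dtype="float32")
    write_features([a, b], ["v0.npy", "v1.npy"],
                   os.path.join(tmp, "audio.memmap"),
                   os.path.join(tmp, "index.json"))
    feats, path = read_item(os.path.join(tmp, "audio.memmap"),
                            os.path.join(tmp, "index.json"), 1)
    print(feats.shape, path)
```

Keeping the offsets in a separate index means a data loader can open the memmap once and slice out single items without reading the whole file into memory.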