Closed Liontooth closed 7 years ago
Should be easy to fix. GSoC qualification: Solving this issue gives 2 points.
Could I have samples please? I'll give this a shot.
You can probably use any of the teletext ones from here:
http://ccextractor.org/doku.php?id=public:general:tvsamples
I can't seem to reproduce this. Is there a particular sample that you noticed this on?
Confirmed. I'll let GSoC applicants give it a go though since it's not too hard.
Also, in this case (teletext), when extracting from bin it says "No captions were found in input." and yields return code 10 even when the captions are extracted properly.
Please send a fix for that :-)
@Liontooth, can you post here the DVB transport stream?
@cfsmp3 @Liontooth While fixing this, I am running into timing issues. For example:
From TS :
20170123191246.080|20170123191249.060|801|<font color="#00ffff">Disappearing? Are you sure, Mofy</font>
20170123191249.160|20170123191253.020|801|I've just seen it! I couldn't believe my eyes.
20170123191253.120|20170123191255.260|801|Mogu, your bag!
From .bin
20170123191245.980|20170123191248.960|801|<font color="#00ffff">Disappearing? Are you sure, Mofy?</font>
20170123191249.060|20170123191252.920|801|I've just seen it! I couldn't believe my eyes.
20170123191253.020|20170123191255.160|801|Mogu, your bag!
But then I found out that a few lines are missing too when using .bin (see https://github.com/CCExtractor/ccextractor/issues/699 ).
Since the timings are correct when extracted without -unixts, it must be something wrong on my part. I am trying to fix it. :)
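For anyone comparing the two outputs quoted above, here is a minimal Python sketch that measures the difference between the TS and .bin start times. The helper names (`parse_ts`, `start_deltas_ms`) are made up for this illustration and are not part of ccextractor; the input lines are copied from the outputs above.

```python
from datetime import datetime

def parse_ts(stamp):
    """Parse a YYYYMMDDhhmmss.mmm timestamp as printed in the ttxt output."""
    return datetime.strptime(stamp, "%Y%m%d%H%M%S.%f")

# Lines copied verbatim from the two outputs quoted above.
ts_lines = [
    '20170123191246.080|20170123191249.060|801|<font color="#00ffff">Disappearing? Are you sure, Mofy</font>',
    "20170123191249.160|20170123191253.020|801|I've just seen it! I couldn't believe my eyes.",
    "20170123191253.120|20170123191255.260|801|Mogu, your bag!",
]
bin_lines = [
    '20170123191245.980|20170123191248.960|801|<font color="#00ffff">Disappearing? Are you sure, Mofy?</font>',
    "20170123191249.060|20170123191252.920|801|I've just seen it! I couldn't believe my eyes.",
    "20170123191253.020|20170123191255.160|801|Mogu, your bag!",
]

def start_deltas_ms(a_lines, b_lines):
    """Return (a - b) start-time differences in milliseconds, line by line."""
    deltas = []
    for a, b in zip(a_lines, b_lines):
        start_a = parse_ts(a.split("|")[0])
        start_b = parse_ts(b.split("|")[0])
        deltas.append(round((start_a - start_b).total_seconds() * 1000))
    return deltas

deltas = start_deltas_ms(ts_lines, bin_lines)
print(deltas)
```

A constant difference across all lines would point to a single offset error rather than drift.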
I was unnecessarily calculating deltas, and there was a mistake somewhere in that. The solution was staring me right in the face :P The timing is correct now (in PR #700 ).
Create a bin file from a DVB transport stream:
ccextractor -ts -pn $PN -out=bin -o $FIL.bin $DIR/$FIL.$EXT
Extracting the text from this bin file:
ccextractor -in=bin -pn 53007 -tpage 891 -datets -ttxt -UCLA -noru -utf8 -parsepat -parsepmt -unixts 1485198721 -o 2017-01-23_1912_FR_TV5_Géopolitis.ccx.out 2017-01-23_1912_FR_TV5_Géopolitis.bin
results in wrong timestamps, a messed up third field, and an extra |:
19700101000109.360|19700101000112.520|CC?||Bonjour, bienvenue dans cette edition de Geopolitis.
while extraction from the transport stream produces the correct output:
20170123191310.360|20170123191313.520|891|Bonjour, bienvenue dans cette edition de Geopolitis.
Let me know if you need samples; this likely holds for any file.
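The two outputs above look related by the -unixts value alone. Below is a rough Python sketch that checks this, assuming the -unixts argument is a UTC epoch reference that gets added to the stream-relative caption time; that reading is an assumption that fits the numbers quoted here, not a statement about how ccextractor implements it. The helper names are hypothetical.

```python
from datetime import datetime, timezone

UNIXTS = 1485198721  # value passed to -unixts in the command above
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def relative_offset(stamp):
    """Interpret a YYYYMMDDhhmmss.mmm stamp as an offset from 1970-01-01 UTC."""
    parsed = datetime.strptime(stamp, "%Y%m%d%H%M%S.%f")
    return parsed.replace(tzinfo=timezone.utc) - EPOCH

def apply_unixts(stamp, unixts):
    """Add the reference time to a relative stamp and format it like the ttxt output."""
    absolute = datetime.fromtimestamp(unixts, tz=timezone.utc) + relative_offset(stamp)
    millis = absolute.microsecond // 1000
    return absolute.strftime("%Y%m%d%H%M%S.") + f"{millis:03d}"

# Lines copied from the wrong (.bin) and correct (TS) outputs above.
bad = "19700101000109.360|19700101000112.520|CC?||Bonjour, bienvenue dans cette edition de Geopolitis."
good = "20170123191310.360|20170123191313.520|891|Bonjour, bienvenue dans cette edition de Geopolitis."

bad_fields = bad.split("|")
good_fields = good.split("|")

# The extra "|" shows up as one additional (empty) field in the bad line.
print(len(bad_fields), len(good_fields))

# Shifting the bad timestamps by the reference time and comparing with the good ones.
fixed_start = apply_unixts(bad_fields[0], UNIXTS)
fixed_end = apply_unixts(bad_fields[1], UNIXTS)
print(fixed_start == good_fields[0], fixed_end == good_fields[1])
```

If both comparisons hold, the .bin path is most likely just not applying the reference time, on top of the separate problem with the third field.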