What are these characters in the bin file?

I opened the file in 'rb' mode and printed it line by line. The output contains many bytes that are not readable text:

with open('/users/cheng/NLP/Data/finished_files/chunked/test_000.bin', 'rb') as file:
    for line in file:
        print(line)

b'R\x1e\x00\x00\x00\x00\x00\x00\n'
b'\xcf<\n'
b'\xf0\x02\n'
b'\x08abstract\x12\xe3\x02\n'
b'\xe0\x02\n'
b"\xdd\x02<s> marseille prosecutor says `` so far no videos were used in the crash investigation '' despite media reports . </s> <s> journalists at bild and paris match are `` very confident '' the video clip is real , an editor says . </s> <s> andreas lubitz had informed his lufthansa training school of an episode of severe depression , airline says . </s>\n"
b'\xd99\n'
b'\x07article\x12\xcd9\n'
b'\xca9\n'

Then I tried to process the files myself: split out the article and the abstract and write them to separate files. After most of the files were processed, I got this error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 3131: invalid start byte

How can I get a clean article and abstract from these files?
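
One thing that seems consistent with the output above: the first 8 bytes, b'R\x1e\x00\x00\x00\x00\x00\x00', read as a little-endian integer give 7762, which looks like a record length rather than text. So the file may be a sequence of length-prefixed binary records, not lines, and iterating with "for line in file" would split records wherever a 0x0a byte happens to occur. Below is a minimal sketch under that assumption (an 8-byte little-endian length followed by that many payload bytes). The function name read_records is hypothetical and the layout is inferred from the dump, not confirmed by the repository.

```python
import struct


def read_records(path):
    """Yield the raw payload of each length-prefixed record in a .bin file.

    Assumed layout (inferred, not confirmed): an 8-byte little-endian
    signed integer giving the payload size, then the payload itself.
    """
    with open(path, "rb") as f:
        while True:
            header = f.read(8)
            if not header:
                break  # clean end of file
            if len(header) < 8:
                raise ValueError("truncated length header")
            (length,) = struct.unpack("<q", header)
            payload = f.read(length)
            if len(payload) < length:
                raise ValueError("truncated record")
            yield payload
```

If the payloads are serialized tf.Example protobufs, which the field names 'abstract' and 'article' in the dump suggest, each one could be passed to tf.train.Example.FromString (assuming TensorFlow is installed) and the two features read from the parsed message. When decoding the resulting bytes to text, passing errors="replace" to bytes.decode would avoid the UnicodeDecodeError on stray bytes such as 0x80.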

abisee / cnn-dailymail

What are these characters in the bin file? #20