robhagemans / pcbasic

PC-BASIC - A free, cross-platform emulator for the GW-BASIC family of interpreters
http://www.pc-basic.org

CTRL+Z in files causes an EOF input past end #172

Closed goldnchild closed 2 years ago

goldnchild commented 2 years ago

Bug report

Problem

Reading a file fails as soon as the read reaches a CTRL+Z character (ASCII 0x1A).

I was trying to read a hexadecimal file and print it out, but it would "truncate" the file once it hit a 0x1a.

Steps

Program

open "pcbasictest.txt" for output as #2
print #2,chr$(26)
close #2
 
open "pcbasictest.txt" for input as #3
a$=input$(1,#3)
Input past end 
Ok 

Crash log

Notes

PC-BASIC version: 2.0.2
Operating system version: Ubuntu

Marrin commented 2 years ago

That's not a bug. GW-BASIC has the exact same feature.

GW-BASIC 3.23
(C) Copyright Microsoft 1983,1984,1985,1986,1987,1988
60300 Bytes free                                     
Ok                                                  
open "test.txt" for output as #2                     
Ok                                                  
print #2,chr$(26)                                    
Ok                                                  
close #2                                             
Ok                                                  
open "test.txt" for input as #3                      
Ok                                                  
a$=input$(1,#3)                                      
Input past end                                      
Ok   

Marrin commented 2 years ago

BTW, this isn't a bug in GW-BASIC either; it is how text-mode files are handled under DOS. To read past a byte with value 26, the file needs to be opened in binary mode. AFAIK the only way to do that in GW-BASIC is with random access files.
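To make the text-mode versus binary-mode distinction concrete, here is a minimal Python sketch. It does not reproduce GW-BASIC or PC-BASIC internals; `read_text_mode` and `read_binary_mode` are hypothetical helper names that only model the convention described above, where a text-mode reader treats byte 26 as the end of the file.

```python
EOF_MARKER = 0x1A  # Ctrl+Z, byte value 26


def read_text_mode(data: bytes) -> bytes:
    """Model a DOS-style text-mode read: stop at the first Ctrl+Z."""
    end = data.find(bytes([EOF_MARKER]))
    return data if end == -1 else data[:end]


def read_binary_mode(data: bytes) -> bytes:
    """Model a binary-mode read: every byte is returned unchanged."""
    return data


sample = b"AB\x1aCD"             # illustrative data, not from the report
print(read_text_mode(sample))    # b'AB'
print(read_binary_mode(sample))  # b'AB\x1aCD'
```

If the very first byte is 26, the text-mode view is empty, which would match the "Input past end" error on the first read.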

To print all the byte values in the text file you created above, this can be used:

list
10 OPEN "test.txt" FOR RANDOM AS #1 LEN=1:A$=SPACE$(1):FIELD #1,1 AS A$
20 FOR I=1 TO LOF(1):GET #1,I:PRINT ASC(A$);:NEXT:PRINT:CLOSE 1
Ok
run
 26  13  10  26 
Ok 

As you can see, your PRINT #2,CHR$(26) wrote not just a byte with value 26 but also byte values 13 and 10 (CR LF) to end the line. The CLOSE then added another byte with value 26 to mark the end of the text file.
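The byte sequence in the listing can be rebuilt by hand. The Python sketch below only models what the listing shows (the payload, a CR LF line ending, and a trailing Ctrl+Z on close); `print_hash` and `close_file` are hypothetical names chosen for illustration, not PC-BASIC functions.

```python
CR, LF, CTRL_Z = 13, 10, 26


def print_hash(buffer: bytearray, text: bytes) -> None:
    """Model PRINT #n: append the text followed by CR LF."""
    buffer.extend(text)
    buffer.extend(bytes([CR, LF]))


def close_file(buffer: bytearray) -> bytes:
    """Model CLOSE on a text file opened for output: append Ctrl+Z."""
    return bytes(buffer) + bytes([CTRL_Z])


buf = bytearray()
print_hash(buf, bytes([26]))   # corresponds to PRINT #2,CHR$(26)
on_disk = close_file(buf)      # corresponds to CLOSE #2
print(list(on_disk))           # [26, 13, 10, 26]
```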

This use of byte 26 is a remnant of earlier operating systems such as CP/M, which did not record an exact file size but measured files in disk blocks, usually of 128 bytes. For text files there was therefore no way to know where in the last block the text ended, so byte value 26 (Ctrl+Z) was used as an end marker.
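The block-size problem can be sketched in a few lines of Python. This is an illustration under stated assumptions, not CP/M code: it assumes the writer fills the unused rest of the last 128-byte block with Ctrl+Z (some programs may have written only a single marker), and `pad_to_records` and `text_length` are hypothetical names.

```python
RECORD_SIZE = 128  # block size in bytes, as mentioned above
CTRL_Z = 0x1A      # byte value 26


def pad_to_records(text: bytes) -> bytes:
    """Pad text with Ctrl+Z up to a whole number of 128-byte blocks."""
    remainder = len(text) % RECORD_SIZE
    padding = 0 if remainder == 0 else RECORD_SIZE - remainder
    return text + bytes([CTRL_Z]) * padding


def text_length(stored: bytes) -> int:
    """Find where the text ends: the first Ctrl+Z, else the full length."""
    end = stored.find(bytes([CTRL_Z]))
    return len(stored) if end == -1 else end


stored = pad_to_records(b"HELLO\r\n")
print(len(stored))          # 128
print(text_length(stored))  # 7
```

The same logic shows why a Ctrl+Z inside the data is indistinguishable from the end marker: `text_length` stops at the first one it finds.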