Open sadiknina opened 6 years ago
Same error here. Has anyone solved it? (Windows 10)
Same error...
It looks like Ctrl+C aborts not only the training process but the main thread too.
I solved it with a workaround. Instead of relying on try/except and waiting for the process to be interrupted by Ctrl+C, I changed the code to listen to the keyboard.
Add this import at the beginning of train.py (it is the third-party keyboard package):
import keyboard
At the end of the function train(), change the try/except block to this:
`try:
    last_batch_idx = 0
    last_batch_time = time.time()
    batch_iter = enumerate(read_batches(batch_size))
    for batch_idx, (batch_xs, batch_ys) in batch_iter:
        print('batch_idx {}'.format(batch_idx))
        # beginning of the change
        if keyboard.is_pressed('q'):  # checked once per batch
            print('saving to weights.npz')
            last_weights = [p.eval() for p in params]
            numpy.savez("weights.npz", *last_weights)
            return last_weights
        # end of the change
        # if your do_batch() is defined without parameters, call do_batch() instead
        do_batch(last_batch_time)
        if batch_idx % report_steps == 0:
            batch_time = time.time()
            if last_batch_idx != batch_idx:
                #print("time for 60 batches {}".format(60 * (last_batch_time - batch_time) / (last_batch_idx - batch_idx)))
                last_batch_idx = batch_idx
                last_batch_time = batch_time
except KeyboardInterrupt:
    print('saving to weights.npz')
    last_weights = [p.eval() for p in params]
    numpy.savez("weights.npz", *last_weights)
    return last_weights`
Now you just need to press Q to stop. (I have to hold the Q key for it to work, because the key is only checked once per batch.)
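A possible way around having to hold the key is to register a hotkey that sets a flag, and have the loop check the flag instead of polling the key. This is only a minimal sketch: `install_stop_hotkey` and `run_batches` are hypothetical helper names, not part of train.py, and it assumes the `keyboard` package's `add_hotkey` is available on your system.

```python
import threading


def install_stop_hotkey(stop_event, key="q"):
    """Set stop_event when `key` is tapped (hypothetical helper).

    Needs the third-party `keyboard` package; it is imported lazily so the
    rest of this sketch runs without it.
    """
    import keyboard
    keyboard.add_hotkey(key, stop_event.set)


def run_batches(batches, do_batch, stop_event):
    """Run do_batch on each batch until stop_event is set.

    Returns the number of batches that were processed.
    """
    done = 0
    for batch in batches:
        if stop_event.is_set():
            break  # the caller can save the weights at this point
        do_batch(batch)
        done += 1
    return done
```

In train() you would create `stop_event = threading.Event()`, call `install_stop_hotkey(stop_event)` once before the loop, and save the weights when the loop exits early. A single tap of Q should then be enough, since the flag stays set until the next check.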
Another alternative is to save the weights periodically. I'm using report_steps = 50, so every 50 steps the loop saves the weights, to avoid losing progress after hours of running:
`try:
    last_batch_idx = 0
    last_batch_time = time.time()
    batch_iter = enumerate(read_batches(batch_size))
    for batch_idx, (batch_xs, batch_ys) in batch_iter:
        print('batch_idx {}'.format(batch_idx))
        if keyboard.is_pressed('q'):  # if key 'q' is pressed
            print('saving to weights.npz')
            last_weights = [p.eval() for p in params]
            numpy.savez("weights.npz", *last_weights)
            return last_weights
        # if your do_batch() is defined without parameters, call do_batch() instead
        do_batch(last_batch_time)
        if batch_idx % report_steps == 0:
            batch_time = time.time()
            if last_batch_idx != batch_idx:
                #print("time for 60 batches {}".format(60 * (last_batch_time - batch_time) / (last_batch_idx - batch_idx)))
                last_batch_idx = batch_idx
                last_batch_time = batch_time
            # beginning of the change: save every report_steps batches
            print('saving to weights.npz')
            last_weights = [p.eval() for p in params]
            numpy.savez("weights.npz", *last_weights)
            # end of the change
except KeyboardInterrupt:
    print('saving to weights.npz')
    last_weights = [p.eval() for p in params]
    numpy.savez("weights.npz", *last_weights)
    return last_weights`
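If the process can be killed at any moment, it may be worth writing the checkpoint to a temporary file first and then renaming it, so an interrupted save does not leave a half-written weights.npz. A minimal sketch of that idea with plain NumPy arrays standing in for the model parameters; `save_weights`, `load_weights` and this `train` are hypothetical helpers, not the functions from train.py:

```python
import os

import numpy as np


def save_weights(path, weights):
    """Save a list of arrays to `path` via a temporary file and a rename."""
    tmp_path = path + ".tmp.npz"  # ends in .npz so numpy keeps the name as-is
    np.savez(tmp_path, *weights)
    os.replace(tmp_path, path)  # overwrites any previous checkpoint


def load_weights(path):
    """Load the arrays back in the order they were saved (arr_0, arr_1, ...)."""
    with np.load(path) as data:
        return [data["arr_{}".format(i)] for i in range(len(data.files))]


def train(num_batches, save_every, path):
    """Toy loop: update the weights each batch, checkpoint every save_every."""
    weights = [np.zeros(3), np.zeros((2, 2))]  # stand-in for [p.eval() for p in params]
    for batch_idx in range(num_batches):
        weights = [w + 1.0 for w in weights]  # stand-in for do_batch()
        if batch_idx % save_every == 0:
            save_weights(path, weights)
    return weights
```

With `save_every=50` this mirrors the report_steps = 50 setup above: at most 50 batches of work are lost if the run dies.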
Good job, it's really helpful, @yurinativo.
It gives this error: TypeError: do_batch() takes 0 positional arguments but 1 was given
Any help resolving it? @5059
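That message usually means the function was defined without parameters but called with one, as in do_batch(last_batch_time) in the snippets above. A small sketch reproducing it, assuming do_batch is a nested function with no parameters (`make_do_batch` is a hypothetical stand-in for the enclosing train()):

```python
def make_do_batch(log):
    """Build a do_batch closure that takes no parameters."""
    def do_batch():
        log.append("batch")
    return do_batch


log = []
do_batch = make_do_batch(log)

try:
    do_batch(123.0)  # mirrors do_batch(last_batch_time): one argument too many
except TypeError as exc:
    message = str(exc)  # "... takes 0 positional arguments but 1 was given"

do_batch()  # calling it with no arguments works
```

If your do_batch is defined like this, changing the call in the loop to do_batch() should clear the error.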
This error is shown while trying to save the trained model using Ctrl+C
NA68ECS 0.0 <-> SS80RJD 1.0
VH96FLC 1.0 <-> SS80RJD 1.0
ND78RRR 1.0 <-> SS80RJD 1.0
JO64KPG 0.0 <-> SS80RJD 1.0
IP11HRV 0.0 <-> SS80RJD 1.0
GS66WQA 0.0 <-> SS80RJD 1.0
WW22MFV 0.0 <-> SS80RJD 1.0
AJ69HHA 1.0 <-> SS80RJD 1.0
LO05ACN 0.0 <-> SS80RJD 1.0
RN46IKZ 1.0 <-> SS80RJD 1.0
PD67JOS 1.0 <-> SS80RJD 1.0
CC50LIG 1.0 <-> SS80RJD 1.0
SH22NYA 1.0 <-> SS80RJD 1.0
EJ63ANL 1.0 <-> SS80RJD 1.0
CQ23FZB 0.0 <-> SS80RJD 1.0
VB46CTT 0.0 <-> SS80RJD 1.0
UW07QYK 1.0 <-> SS80RJD 1.0
UY55WEE 1.0 <-> SS80RJD 1.0
YY76RDB 0.0 <-> SS80RJD 1.0
HE67HQI 0.0 <-> SS80RJD 1.0
ZM60MGH 0.0 <-> SS80RJD 1.0
TI57KNR 0.0 <-> SS80RJD 1.0
JM74WYE 0.0 <-> SS80RJD 1.0
XQ06DTI 1.0 <-> SS80RJD 1.0
HZ92TYI 1.0 <-> SS80RJD 1.0
PQ07UOA 1.0 <-> SS80RJD 1.0
YF28WPW 0.0 <-> SS80RJD 1.0
YG80GAG 0.0 <-> SS80RJD 1.0
LB83MTT 0.0 <-> SS80RJD 1.0
SA00DVB 1.0 <-> SS80RJD 1.0
WB27CRE 1.0 <-> SS80RJD 1.0
SP44VKA 0.0 <-> SS80RJD 1.0
NP76UVV 0.0 <-> SS80RJD 1.0
DY99FUI 1.0 <-> SS80RJD 1.0
TM74IZO 1.0 <-> SS80RJD 1.0
FH87OOE 0.0 <-> SS80RJD 1.0
HY33LBK 0.0 <-> SS80RJD 1.0
QB77WNK 1.0 <-> SS80RJD 1.0
IV34HCY 0.0 <-> SS80RJD 1.0
ZK77JAL 0.0 <-> SS80RJD 1.0
QQ57ATV 0.0 <-> SS80RJD 1.0
TB67JJQ 0.0 <-> SS80RJD 1.0
PI62NYX 0.0 <-> SS80RJD 1.0
SI33DHX 1.0 <-> SS80RJD 1.0
AQ50HNO 1.0 <-> SS80RJD 1.0
NQ57FJT 1.0 <-> SS80RJD 1.0
NH13WAC 0.0 <-> SS80RJD 1.0
UA09ZPI 0.0 <-> SS80RJD 1.0
TP98FJQ 0.0 <-> SS80RJD 1.0
IO89BAM 1.0 <-> SS80RJD 1.0
B 40 0.00% 44.00% loss: -23851382784.0 (digits: 1750573696.0, presence: -25601955840.0) |XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX|
time for 60 batches 141.08570551872253
forrtl: error (200): program aborting due to control-C event
Image PC Routine Line Source
libifcoremd.dll 00007FFD987A94C4 Unknown Unknown Unknown
KERNELBASE.dll 00007FFDECA97EDD Unknown Unknown Unknown
KERNEL32.DLL 00007FFDEEFF1FE4 Unknown Unknown Unknown
ntdll.dll 00007FFDEFBBEFB1 Unknown Unknown Unknown
Has anyone else had this issue and solved it? Please help.
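One thing that may be worth trying: the forrtl message comes from the Intel Fortran runtime (libifcoremd.dll), which installs its own console Ctrl+C handler on Windows. A commonly suggested workaround is to set the FOR_DISABLE_CONSOLE_CTRL_HANDLER environment variable before NumPy/SciPy are imported, so that Python's own KeyboardInterrupt reaches the except block. The sketch below is untested against this project and rests on that assumption; `train` here is a hypothetical stand-in, not the function from train.py.

```python
import os

# Assumption: this must be set before numpy/scipy load the Intel Fortran
# runtime, i.e. before the first import of those packages.
os.environ["FOR_DISABLE_CONSOLE_CTRL_HANDLER"] = "1"

import numpy as np


def train(num_batches, do_batch, path):
    """Toy loop that saves the weights when interrupted with Ctrl+C."""
    weights = [np.zeros(2)]  # stand-in for the model parameters
    try:
        for batch_idx in range(num_batches):
            do_batch(batch_idx, weights)
    except KeyboardInterrupt:
        print("saving to", path)
        np.savez(path, *weights)
    return weights
```

In train.py this would mean putting the two `os` lines at the very top of the file, above every other import.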