Closed emirpnet closed 3 years ago
@emirpnet Thanks for this! I made some changes to it to allow for generating the GPX output without having to load the whole input file into memory or process it twice. You can see them at https://github.com/dtcooper/python-fitparse/tree/feature/gpx . Do you see any issues with it or things you'd like changed?
@pR0Ps I'd be happy for the feature to be merged. Personally I use it a lot.
Just some questions, as I don't understand the control flow here (from line 108):
# file creation time (if a file_id record exists)
first_record = []
for message in records:
if message.name == "file_id":
for field_data in message:
if field_data.name == "time_created" and type(field_data.value) == datetime.datetime:
yield ' <time>{}</time>\n'.format(field_data.value.strftime(GPX_TIME_FMT))
break
else:
# No time found in the fields, check next record
continue
break
elif message.name == "record":
first_record.append(message)
break
First of all I'm not sure where the else: continue
statements in lines 116-118 belong to. I think the indentation is off and/or it might be redundant.
But more importantly I'm not sure I understand the break statements at the end of the if and elif block correctly. So you are looking for the first occurance of either a "file_id" message or "record" message. (Does the "file_id" always come before the first "record" in a FIT file? I have really no knowledge about the FIT file specification.)
And in line 136:
for message in itertools.chain(first_record, records):
the code goes through the first_record and all the other records, right? Doesn't that double the first_record in the output, if there is no "file_id"?
Sorry for all the questions, I'd really like to learn from the code.
You are correct that the loop is looking for the first file_id
(with time_created
) or record
message. According to the FIT specification's best practices, the file_id
should always be the first data record so this should work in most (all?) cases.
In Python, for
/while
loops can have an else
section on them that executes if the loop finishes without breaking. In this case if the time_created
field is found, it will yield the <time>xxx</time>
line and break
out of the loop. This skips the else: continue
block and executes the break
right below it to break out of the outer loop as well. If there was no time_created
then the else: continue
will be executed, bypassing the break
and causing the outer loop to continue, checking the next record.
Since records
is an iterator, iterating over it again won't start from the beginning, but from where it previously left off. This means that the if a record
-type message was seen while looking for the file_id
message, it needs to be saved somewhere so it can actually be processed later. This is why it's added to the first_record
list (note that this list is always either empty or only has 1 record in it - it's a list only to make iteration over it easier).
The later for message in itertools.chain(first_record, records)
loop will first process the record
-type message if one was seen while looking for the time before processing the rest of the messages.
Hopefully that answered all your questions. I agree that the control flow is a bit weird. I'm not super happy with it but I couldn't think of a cleaner way to do it in a single pass without resorting to something like making the iterator peekable which seemed like overkill. I'm open to suggestions though.
@pR0Ps Thank you for clearing it up for me! I didn't know about the else statement for loops in Python. (Should have thought about records
being an iterator, though.) So thanks again for picking up the pull request!
Merged in 0f4e23e
Basic GPX output for fitdump (at this point without the Garmin-extension of the GPX specs, so all non-GPS data like heartrate, cadence, etc. is discarded)