dtcooper / python-fitparse

Python library to parse ANT/Garmin .FIT files
http://pythonhosted.org/fitparse/
MIT License
738 stars 184 forks source link

Feature: GPX output for fitdump #108

Closed emirpnet closed 3 years ago

emirpnet commented 4 years ago

Basic GPX output for fitdump (at this point without the Garmin-extension of the GPX specs, so all non-GPS data like heartrate, cadence, etc. is discarded)

pR0Ps commented 3 years ago

@emirpnet Thanks for this! I made some changes to it to allow for generating the GPX output without having to load the whole input file into memory or process it twice. You can see them at https://github.com/dtcooper/python-fitparse/tree/feature/gpx . Do you see any issues with it or things you'd like changed?

emirpnet commented 3 years ago

@pR0Ps I'd be happy for the feature to be merged. Personally I use it a lot.

Just some questions, as I don't understand the control flow here (from line 108):

# file creation time (if a file_id record exists)
    first_record = []
    for message in records:
        if message.name == "file_id":
            for field_data in message:
                if field_data.name == "time_created" and type(field_data.value) == datetime.datetime:
                    yield '  <time>{}</time>\n'.format(field_data.value.strftime(GPX_TIME_FMT))
                    break
            else:
                # No time found in the fields, check next record
                continue
            break
        elif message.name == "record":
            first_record.append(message)
            break

First of all I'm not sure where the else: continue statements in lines 116-118 belong to. I think the indentation is off and/or it might be redundant.

But more importantly I'm not sure I understand the break statements at the end of the if and elif block correctly. So you are looking for the first occurance of either a "file_id" message or "record" message. (Does the "file_id" always come before the first "record" in a FIT file? I have really no knowledge about the FIT file specification.)

And in line 136:

for message in itertools.chain(first_record, records):

the code goes through the first_record and all the other records, right? Doesn't that double the first_record in the output, if there is no "file_id"?

Sorry for all the questions, I'd really like to learn from the code.

pR0Ps commented 3 years ago

You are correct that the loop is looking for the first file_id (with time_created) or record message. According to the FIT specification's best practices, the file_id should always be the first data record so this should work in most (all?) cases.

In Python, for/while loops can have an else section on them that executes if the loop finishes without breaking. In this case if the time_created field is found, it will yield the <time>xxx</time> line and break out of the loop. This skips the else: continue block and executes the break right below it to break out of the outer loop as well. If there was no time_created then the else: continue will be executed, bypassing the break and causing the outer loop to continue, checking the next record.

Since records is an iterator, iterating over it again won't start from the beginning, but from where it previously left off. This means that the if a record-type message was seen while looking for the file_id message, it needs to be saved somewhere so it can actually be processed later. This is why it's added to the first_record list (note that this list is always either empty or only has 1 record in it - it's a list only to make iteration over it easier).

The later for message in itertools.chain(first_record, records) loop will first process the record-type message if one was seen while looking for the time before processing the rest of the messages.

Hopefully that answered all your questions. I agree that the control flow is a bit weird. I'm not super happy with it but I couldn't think of a cleaner way to do it in a single pass without resorting to something like making the iterator peekable which seemed like overkill. I'm open to suggestions though.

emirpnet commented 3 years ago

@pR0Ps Thank you for clearing it up for me! I didn't know about the else statement for loops in Python. (Should have thought about records being an iterator, though.) So thanks again for picking up the pull request!

pR0Ps commented 3 years ago

Merged in 0f4e23e