MTG / gaia

C++ library to apply similarity measures and classifications on the results of audio analysis, including Python bindings. Together with Essentia it can be used to compute high-level descriptions of music.
http://essentia.upf.edu
GNU Affero General Public License v3.0

json_to_sig NOT eliminating metadata attributes #79

Closed derekacosta closed 4 years ago

derekacosta commented 6 years ago

In gaia/src/bindings/pygaia/scripts/classification/json_to_sig.py, the code parses the JSON and dumps it as YAML; however, the output produced is the same as the input, because the for loop that iterates through the file fails to pick out any attributes.

A more appropriate fix would be to simply remove the for loop and apply the conditions inside it directly, like so:

def convertJsonToSig(filelist_file, result_filelist_file):
    fl = yaml.load(open(filelist_file, 'r'))
    result_fl = fl
    errors = []

    #for trackid, json_file in fl.iteritems():
    #    data = json.load(open(json_file))
    #
    #    # remove descriptors, that will otherwise break gaia_fusion due to incompatibility of layouts
    #    if 'tags' in data['metadata']:
    #        del data['metadata']['tags']
    #    if 'sample_rate' in data['metadata']['audio_properties']:
    #        del data['metadata']['audio_properties']['sample_rate']
    #    if 'lossless' in data['metadata']['audio_properties']:
    #        del data['metadata']['audio_properties']['lossless']
    #    data = json.dumps(json_file)
    #    data = json.loads(data)
    #    for k,v in data.iteritems():
    #        print k
    #    print data['tags']
    #    sig_file = os.path.splitext(json_file)[0] + '.sig'
    #    yaml.dump(data, open(sig_file, 'w'))
    #    result_fl[trackid] = sig_file
    if 'tags' in result_fl['metadata']:
        del result_fl['metadata']['tags']
    if 'sample_rate' in result_fl['metadata']['audio_properties']:
        del result_fl['metadata']['audio_properties']['sample_rate']
    if 'lossless' in result_fl['metadata']['audio_properties']:
        del result_fl['metadata']['audio_properties']['lossless']
    yaml.dump(result_fl, open(result_filelist_file[:-5] + '.sig', 'w'))

    #print "Failed to convert", len(errors), "files:"
    #for e in errors:
    #    print e

    #return len(errors) == 0

For anyone reading this who is interested, I am also running the following command, which traverses directories of JSON files and converts them into sig files: find . -name '*.json' -exec sudo python2 ../../../gaia/src/bindings/pygaia/scripts/classification/json_to_sig.py {} {} \;
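
The directory traversal done by the find command could also be sketched in Python, collecting the JSON files into a single mapping instead of invoking the script once per file. This is only an illustrative sketch and not part of gaia: the function name build_filelist and the choice of the relative path without extension as the track id are assumptions.

```python
import os


def build_filelist(root):
    """Map a track id to the path of every .json file found under root.

    Hypothetical helper (not part of gaia). The track id is assumed to be
    the file path relative to root, without its extension.
    """
    filelist = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith('.json'):
                continue  # skip anything that is not a JSON descriptor file
            path = os.path.join(dirpath, name)
            trackid = os.path.splitext(os.path.relpath(path, root))[0]
            filelist[trackid] = path
    return filelist
```

The resulting dictionary has the same shape the commented-out loop above iterates over: one track id per key and one JSON file path per value.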

dbogdanov commented 6 years ago

Can you elaborate on what you want to achieve? I have trouble understanding the problem from your current description. From what I see in your suggested code, you are treating filelist_file as if it were a descriptor file, which it is not.
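
To make the distinction concrete, here is a minimal sketch, assuming the filelist is a mapping from track id to JSON descriptor path as in the original loop. The metadata keys live in each descriptor file, not in the filelist, so they have to be removed per file. The function names strip_incompatible_metadata and load_stripped_descriptors are hypothetical and not part of gaia, and the sketch is written for Python 3 using only the standard library.

```python
import json


def strip_incompatible_metadata(data):
    """Remove the metadata keys named in the original script, if present.

    Hypothetical helper; modifies the descriptor dict in place and returns it.
    """
    metadata = data.get('metadata', {})
    metadata.pop('tags', None)
    audio_properties = metadata.get('audio_properties', {})
    for key in ('sample_rate', 'lossless'):
        audio_properties.pop(key, None)
    return data


def load_stripped_descriptors(filelist):
    """Load every descriptor listed in the filelist and strip its metadata.

    filelist is assumed to map a track id to the path of a JSON descriptor.
    """
    stripped = {}
    for trackid, json_file in filelist.items():
        with open(json_file) as f:
            stripped[trackid] = strip_incompatible_metadata(json.load(f))
    return stripped
```

Dumping each stripped descriptor to a .sig path with yaml.dump, as the loop in the original script does, would complete the conversion.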

alastair commented 4 years ago

This is fixed with the changes in #86, which remove metadata tags as part of the db building process.