allthemusicllc / atm-cli

Command line tool for generating and working with MIDI files.
http://allthemusic.info
Other
1.39k stars 107 forks

Why enumerate? #8

Closed BartMassey closed 3 years ago

BartMassey commented 4 years ago

From a legal standpoint, what is the advantage of enumerating all the sequences? I think there's a very good "compressed format" for storing this collection of sequences: given a "melody number" and the input note sequence, one can quickly produce the sequence with that number. Does explicitly representing each sequence, which compresses much less well, really do something legally valuable?
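To make the "melody number" idea concrete, here is a minimal sketch. It assumes melodies of one fixed length drawn from a fixed list of notes, and treats the number as a base-k numeral whose digits pick notes. The function name `melody_from_number` and the note names are hypothetical, not part of atm-cli.

```python
def melody_from_number(number, notes, length):
    """Return the melody at position `number` among all len(notes)**length melodies.

    Assumption: melodies are ordered like base-k numerals, most significant
    note first, where k = len(notes).
    """
    base = len(notes)
    if not 0 <= number < base ** length:
        raise ValueError("melody number out of range")
    digits = []
    for _ in range(length):
        # Peel off the least significant digit on each pass.
        number, digit = divmod(number, base)
        digits.append(digit)
    # Reverse so the most significant digit comes first.
    return [notes[d] for d in reversed(digits)]


if __name__ == "__main__":
    notes = ["C", "D", "E"]  # illustrative note set
    for n in range(3 ** 2):
        print(n, melody_from_number(n, notes, 2))
```

Under these assumptions the whole collection is described by the note list, the length, and this function, which is the "compressed format" the question refers to.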

acrisci commented 4 years ago

The melodies must each be fixed in a tangible medium. What you're describing is a procedure for generating melodies, which does not give you copyright.

InnovativeInventor commented 4 years ago

From an information theoretic/philosophical perspective, if you have the program to generate the melody, then the information is still preserved in the program and is easily recoverable. There is no theoretical difference between what @BartMassey is suggesting and data compression of the melody files. For that matter, there is no difference between encoding the melodies in different data formats or different compression formats.

The entropy/true information in the melody is fully captured by the program.

See:

I'm not a lawyer and am not qualified to comment on the legal merits/implications of fundamental principles in computability and information theory. However, I highly recommend looking at these topics – they're very interesting!

allthemusicllc commented 4 years ago

@BartMassey it's a good question, and we considered a few different "compression" schemes, though our primary goal was to produce this dataset for submission to the US Copyright Office. As @acrisci points out, the USCO requires that the music be written to a fixed, tangible medium, hence our choice to actually produce all melodies.

BartMassey commented 4 years ago

K. But you appear to be already producing the "melodies" aggregated and compressed, just using a format (flate2-compressed tarballs) that is inefficient and inappropriate for the problem at hand.

I think you could whip out a compressor in a few hours that worked by sorting the melodies length-primary lexical-secondary, converting each melody to a canonical index, and then run-length encoding the indices. I don't see any philosophical or legal difference between this and compressed tarballs, but it would sure be more efficient if you were adding melodies to the archive in order. Indeed, you could probably find a way to add an aggregation of melodies at one go…
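A rough sketch of the compressor described above, under several assumptions: melodies are non-empty lists over a fixed note list, the "canonical index" is the melody's rank in length-primary, lexical-secondary order, and "run-length encoding" means storing each run of consecutive indices as a `(start, count)` pair. All function names are hypothetical and nothing here is taken from atm-cli.

```python
def canonical_index(melody, notes):
    """Rank of a non-empty melody in length-primary, lexical-secondary order."""
    base = len(notes)
    position = {note: i for i, note in enumerate(notes)}
    # Number of melodies strictly shorter than this one (lengths 1..len-1).
    offset = sum(base ** shorter for shorter in range(1, len(melody)))
    rank = 0
    for note in melody:
        rank = rank * base + position[note]
    return offset + rank


def run_length_encode(indices):
    """Collapse sorted, de-duplicated indices into (start, count) runs."""
    runs = []
    for index in sorted(set(indices)):
        if runs and index == runs[-1][0] + runs[-1][1]:
            runs[-1][1] += 1  # extends the current run of consecutive indices
        else:
            runs.append([index, 1])
    return [tuple(run) for run in runs]


def compress(melodies, notes):
    """Encode a collection of melodies as runs of canonical indices."""
    return run_length_encode(canonical_index(m, notes) for m in melodies)
```

With this encoding, an archive holding every melody up to some length collapses to a single run, and adding melodies in order only extends the last run.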

InnovativeInventor commented 4 years ago

If I'm interpreting @allthemusicllc properly (from their talks/stuff online, etc.), it seems like the primary reason is "access".

I've thought about this some more and thought of a few scenarios that explain my thinking.

1. You could print them out on a traditional music score and distribute them (most accessible).
2. You could also encode them into bits and distribute it electronically.
3. You could efficiently encode it (read: compress) with a popular algorithm and then distribute that electronically (what @allthemusicllc is doing).
4. You could write your own niche/custom compression algorithm and distribute that electronically (what @BartMassey is suggesting).
5. You could write your own extremely efficient compression algorithm (read: generation program) and distribute that electronically.
6. You could write a spec/detailed high level description of the generation program and distribute that spec/description electronically (since anybody should be able to implement the spec).
7. You could write a more abstract/higher level overview of the generation program and distribute that electronically.
8. You could tell people about your idea to write a generation program and the rough idea behind it . . .

Eventually there is a point where the melodies become quite inaccessible. Sure, these are all theoretically "accessible" and all information is technically recoverable/compressed. However, since @allthemusicllc is focused on increasing "access" of the melodies, there must be a limit somewhere between all the examples I brought up ^. The last scenario is ridiculous.

I personally think that an executable generation program should be sufficient, but that's my personal opinion. I'm not a legal expert.

On a side note, the last idea may not be so ridiculous – it serves to highlight the absurd nature of allowing people to copyright such simple melodies.

@BartMassey @allthemusicllc please let me know if I'm completely incoherent or I'm not understanding this properly.

BartMassey commented 4 years ago

Another alternative not yet considered: generate a single melody that contains all n-note melodies as a subset and copyright that. Then any n-note melody infringes on a portion of it.

I suspect that this might be easier to sell to a court, as it constitutes a sort of creative work: some editing was done in deciding in what order to place the notes. On the other hand, the issue of Fair Use becomes slightly harder to argue ("slightly" only because it's already pretty hard). If I'm defending one of these lawsuits (as a non-attorney), I am going to argue (among other things) that my use of "your" melody is Fair Use of an utterly insignificant and virtually valueless fraction of your overall "composition": this argument is nearly identical whether the composition consists of n tiny files or one big one.
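One possible way to build the single melody described above is a de Bruijn sequence, which is a swapped-in technique not named anywhere in this thread. The sketch assumes that "contains as a subset" means "contains as a contiguous run of notes". The names `de_bruijn` and `covering_melody` are hypothetical.

```python
def de_bruijn(k, n):
    """Cyclic de Bruijn sequence over symbols 0..k-1 for windows of length n.

    Standard Lyndon-word construction; the result has length k**n.
    """
    a = [0] * (n + 1)
    sequence = []

    def db(t, p):
        if t > n:
            if n % p == 0:
                sequence.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return sequence


def covering_melody(notes, length):
    """One melody that contains every `length`-note melody as a contiguous run."""
    cyclic = de_bruijn(len(notes), length)
    # Unroll the cycle by repeating its first length-1 symbols at the end.
    linear = cyclic + cyclic[:length - 1]
    return [notes[i] for i in linear]
```

For k notes and melodies of length n, the result has k**n + n - 1 notes, compared with n * k**n notes when every melody is written out separately.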

allthemusicllc commented 3 years ago

Due to the lack of conversation on this issue since February, we are going to close it for now. Appreciate the input from everyone, and we hope that you all keep the conversation alive!