Open abetusk opened 5 years ago
Sorry for the very long delay. I don't know exactly; the raw property is extracted from the variable TralbumData in a <script> tag.
It looks like each track has its own property license_type, and they all have the value 8 for your example.
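To make that concrete, here is a minimal sketch of how the per-track values could be collected. It assumes the parsed TralbumData object exposes its tracks as an array under trackinfo; that shape and the helper name collectLicenseTypes are assumptions for illustration, not something bandcamp-scraper is confirmed to provide.

```javascript
// Hypothetical helper: collect the distinct license_type values of an album.
// Assumes `tralbumData.trackinfo` is an array of track objects (an assumption
// about the page data, not a documented API).
function collectLicenseTypes(tralbumData) {
  const tracks = (tralbumData && tralbumData.trackinfo) || [];
  const seen = new Set();
  for (const track of tracks) {
    // Skip tracks that carry no license_type at all.
    if (track.license_type !== undefined && track.license_type !== null) {
      seen.add(String(track.license_type));
    }
  }
  return Array.from(seen);
}

// Made-up sample mimicking the example album, where every track has value 8.
const sample = { trackinfo: [{ license_type: 8 }, { license_type: 8 }] };
console.log(collectLicenseTypes(sample)); // [ '8' ]
```

Returning the values as strings keeps them usable as keys in a lookup table, whatever type the page data happens to use.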
I saw that the license information is on the page, so it is possible to extract it and add it to the data set.
<h3 class="license-label">license</h3>
<div id="license" class="info license">
<a class="cc-icons" href="http://creativecommons.org/licenses/by-sa/3.0/" target="_blank">
<span class="attribution"></span>
<span class="share-alike"></span>
</a>
<a href="http://creativecommons.org/licenses/by-sa/3.0/" target="_blank">some rights reserved</a>
</div>
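As a rough sketch of that extraction, the license link in markup like the above could be split into a name and a version. The helper name extractCcLicense is hypothetical, and the regex is only for brevity; a real scraper would more likely go through an HTML parser.

```javascript
// Hypothetical helper: find the first Creative Commons license link in a
// chunk of page HTML and split it into URL, license name and version.
function extractCcLicense(html) {
  const pattern =
    /href="(https?:\/\/creativecommons\.org\/licenses\/([a-z-]+)\/(\d+\.\d+)\/?)"/i;
  const match = pattern.exec(html);
  if (!match) return null; // no Creative Commons link found
  return { url: match[1], name: match[2].toLowerCase(), version: match[3] };
}

// Shortened copy of the markup shown above.
const snippet =
  '<div id="license" class="info license">' +
  '<a class="cc-icons" href="http://creativecommons.org/licenses/by-sa/3.0/" target="_blank"></a>' +
  '</div>';
console.log(extractCcLicense(snippet));
```

Returning null when no link matches leaves it to the caller to decide how to treat pages without a Creative Commons link.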
Wow, I completely missed that, thanks.
This most likely "solves" the issue as that's the core information so maybe closing this issue is appropriate.
It would be nice to have a mapping of license_type to what the actual license is but this can be a separate issue. Is this something you'd be willing to add? Do you have any thoughts on how to get a mapping or if it's stable?
I searched quickly, but I did not find any mapping. Maybe it would be "easier" to scrape it from the HTML? 🤔 What exactly would you like to have? name? version? URL?
Here is what I came up with for the license map:
var license_map = {
  // Raw license_type value from TralbumData -> "<license>;<version>"
  "":  { "license_type": "unknown" },
  "0": { "license_type": "unknown" },
  "1": { "license_type": "copyright" },
  "2": { "license_type": "by-nc-nd;3.0" },
  "3": { "license_type": "by-nc-sa;3.0" },
  "4": { "license_type": "by-nc;3.0" },
  "5": { "license_type": "by-nd;3.0" },
  "6": { "license_type": "by;3.0" },
  "7": { }, // never observed in the scraped data; meaning unknown
  "8": { "license_type": "by-sa;3.0" }
};
I (ahem) have some scraped data from Bandcamp and I don't see any reference to license type 7. Maybe Bandcamp reserved this license type to be sa;3.0 but since essentially no one uses that license and/or Bandcamp doesn't provide it as an option it doesn't show up.
Here are some bands to test against to see whether the above license type map is correct:
1,copyright,https://00000000000000000000.bandcamp.com/track/a
2,by-nc-nd;3.0,https://000-deer.bandcamp.com/track/23-59-s-2
3,by-nc-sa;3.0,https://0099.bandcamp.com/track/wrap-around-yr-dreams
4,by-nc;3.0,https://00raikage.bandcamp.com/track/souless
5,by-nd;3.0,https://01000001lien.bandcamp.com/track/d-n-b-a-s-b-c
6,by;3.0,https://01101001-01100100-01101100.bandcamp.com/track/qus-paradigm
7,sa;3.0,?
8,by-sa;3.0,https://01110.bandcamp.com/track/ignorance
Note, the above list is in no way an endorsement of the bands or any statement about their quality.
All the links that I could find for the Creative Commons licenses from Bandcamp refer to version 3.0 (for example licenses/by-sa/3.0/).
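For illustration, here is a minimal sketch of how a raw license_type could be turned into a name, version and URL using the mapping above. The helper describeLicense and the table name LICENSE_BY_TYPE are hypothetical, and the URL pattern is extrapolated from the by-sa link on the page, so treat it as an assumption rather than something verified for every license.

```javascript
// Condensed copy of the mapping above, with ';' separating name and version.
// Type 7 is left out because its meaning is unknown.
const LICENSE_BY_TYPE = {
  '1': 'copyright',
  '2': 'by-nc-nd;3.0',
  '3': 'by-nc-sa;3.0',
  '4': 'by-nc;3.0',
  '5': 'by-nd;3.0',
  '6': 'by;3.0',
  '8': 'by-sa;3.0'
};

// Hypothetical helper: describe a raw license_type value.
function describeLicense(licenseType) {
  const key = String(licenseType == null ? '' : licenseType);
  const known = Object.prototype.hasOwnProperty.call(LICENSE_BY_TYPE, key);
  // Empty, 0, 7 and anything unexpected all fall back to "unknown".
  if (!known) return { name: 'unknown', version: null, url: null };
  const entry = LICENSE_BY_TYPE[key];
  if (entry === 'copyright') return { name: 'copyright', version: null, url: null };
  const [name, version] = entry.split(';');
  // URL pattern assumed from the by-sa/3.0 link found in the page HTML.
  const url = `http://creativecommons.org/licenses/${name}/${version}/`;
  return { name, version, url };
}

console.log(describeLicense(8));
```

Keeping name, version and URL as separate fields would let callers pick whichever form they need instead of parsing the combined string themselves.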
Any chance of getting this license mapping folded into bandcamp-scraper? I'm happy to make a ticket, issue a pull request, etc. if that's something you're open to.
Maybe the type 7 is not used anymore or is planned to be used in the future, I don't know!
Yes, it would be nice to add it to the scraper! 😄 You can open an issue and create a pull request!
I do not see any license information returned by getAlbumInfo and it would be nice to provide that information. For example, running:
returns:
The album is put under a CC-BY-SA license but I don't see that reflected in the returned data. I do see a licensed_version_ids and package_associated_license_id but I'm not sure if that's relevant to the license the album is put under and they're both null in this case.