owncloud / music

:notes: Music app for ownCloud
GNU Affero General Public License v3.0

Metadata not extracted from files on SMB share #670

Open raulsavu opened 5 years ago

raulsavu commented 5 years ago

The artist name is not read from FLAC files. As a result, I see all albums under "Unknown Artist". See the screenshot.

paulijar commented 5 years ago

Thanks for the report. I have only a few test files in FLAC format but on those the artist information is extracted with no problems. Can you tell anything more about the type of the metadata on your tracks? What do you see if you view details of the tracks in the Music app (hover over track name and press the "i" symbol)?

raulsavu commented 5 years ago

To your question: I see only the path and the track name.

raulsavu commented 5 years ago

I have done some further research. My music collection is stored on an SMB share. If I copy the FLACs to a local directory on the server, then everything works correctly. The details I see when I click the "i" symbol are complete.

What remains open is that tags are not read from SMB shares. (Background: The server is a small and fast server at Hetzner and the SMB share is a large "storage box", as they call it. It is connected through the "external share" plugin from Nextcloud.)

paulijar commented 5 years ago

Okay, thanks for the info. So the case seems to be that no metadata can be extracted from files on SMB share. The album names, track titles, and track numbers shown probably come from the fallback logic which deduces these from the file and folder names in absence of metadata. The details view shows the metadata exactly as extracted from the file contents, without applying any fallbacks.

Issues of this kind have been reported before in #415 and #463, but those reports were closed because the original reporters could not be reached or were no longer able to test whether the issue also affects the latest versions of the Music app and Nextcloud/ownCloud. Personally, I have not tested SMB shares.

The root cause of this issue is probably in the Nextcloud core (or its external storage plugin). Individual apps shouldn't need to care about different storage options. More investigation would be needed to tell if there is anything we can do on the application side.

raulsavu commented 5 years ago

The stuff with the fallback looks correct to me.

The core, I hope, is not much of a problem. I am also using the "Audio Player"-plugin. That plugin can read out tag information.

paulijar commented 5 years ago

> I am also using the "Audio Player"-plugin. That plugin can read out tag information.

Okay, that's good to know. So maybe we could find out something by comparing the implementations of Music and Audio Player. Unfortunately, I have been rather busy for the past few months and I don't know when I would have time to investigate further. But sooner or later I'll take a look.

paulijar commented 5 years ago

I finally had some time to investigate this. After a bit of a struggle, I managed to set up Nextcloud to mount a shared folder from a Win10 PC via SMB. Still, I couldn't reproduce the issue on my system: the metadata of the tracks on the SMB share was extracted just fine.

However, I found this related report: https://github.com/nextcloud/server/issues/3691. Apparently, in some system configurations, you cannot use fseek on files on an SMB share. For this reason, the getID3 library used by both the Audio Player and Music apps cannot extract the metadata. Audio Player has implemented a workaround for this: if fseek does not work, it makes a local copy of each audio file during scanning. The downside is that making the local copies may be extremely slow and costly, as it may cause gigabytes of data transfer on a large collection. Hence, I'm not totally convinced that it's really worth it.
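To illustrate the trade-off described above, here is a minimal Python sketch of the "copy locally when seeking fails" idea. It is not the Audio Player or Music code (both are PHP apps); the function name `ensure_seekable` and the `spool_limit` parameter are made up for this example.

```python
import io
import shutil
import tempfile


def ensure_seekable(stream, spool_limit=8 * 1024 * 1024):
    """Return (usable_stream, was_copied).

    Hypothetical helper: if the stream supports seeking, it is returned
    unchanged. Otherwise its remaining content is copied into a local
    temporary file, which is always seekable.
    """
    try:
        if stream.seekable():
            # Probe with a real seek, since a stream may claim to be
            # seekable and still fail when actually asked to seek.
            pos = stream.tell()
            stream.seek(0, io.SEEK_END)
            stream.seek(pos)
            return stream, False
    except (OSError, io.UnsupportedOperation):
        pass

    # Costly path: the whole file is transferred just to read its tags.
    # Data beyond spool_limit bytes is spilled from memory to disk.
    local = tempfile.SpooledTemporaryFile(max_size=spool_limit)
    shutil.copyfileobj(stream, local)
    local.seek(0)
    return local, True
```

The copy branch is exactly where the cost comes from: every affected file is read in full over the network, even though tag parsing usually needs only a small part of it.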

raulsavu commented 5 years ago

Thanks for your investigations. The problem seems to lie a little bit deeper. One of the proposed solutions is to install libsmbclient for PHP. If I do that, then the SMB share is not reachable at all. A different solution is to mount the share directly under Linux without using the Nextcloud interface. But the Ubuntu minimal installation has problems reading UTF-8, so I will have to exchange the kernel.

HaseHarald commented 5 years ago

I'm basically having the same problem, except it's MP3s instead of FLACs. However, the "fallback" that reads album and artist information from the path and filename does work on about 3% of the collection. (The entire collection is organized in paths like this: Artist/Album/Title)

I agree that obtaining a local copy to read the tags is probably not a good idea in general. Although it would help in my situation because I don't mind the traffic over LAN. But would it be a solution to be able to force the app to use the fallback? After all it works when the files are named correctly. Is there even a way to force it to use the fallback already?

paulijar commented 5 years ago

@HaseHarald Yes, it has crossed my mind before that it might sometimes be useful to be able to force the Music app to skip the metadata extraction and use only the file and folder names. I've been mostly thinking about the case of a huge music collection where scanning all the metadata may take hours.

What do you mean when you say that the fallback solution does not work on 3% of your collection? What happens on those tracks and albums, and is there something in common among the tracks where it fails?

HaseHarald commented 5 years ago

No no, you got that the wrong way around. The fallback only works on about 8% of the collection (I just did the math, it's about 8%, not 3%). The other 92% are all under "Unknown artist". They all have in common that the ID3 tags are filled with at least Artist and Title, and most of the time the Album as well. But they have that in common with the ones that work, too. They are also all in the same directory structure I mentioned above, which again is true of the ones that work. I looked at permissions, timestamps, any weird characters in the names, and other things like that, but could not find anything that would tell those two groups apart.

Today I dumped the cached collection and let the plugin read the files anew. Now some of the albums that worked before don't work anymore and others do. I have 24 albums that are recognized and 291 that are not. I don't know the exact numbers from before I dumped the cache, but it feels like about the same. So I'm guessing it's more about the count than about the individual files?

paulijar commented 5 years ago

@HaseHarald Ah, okay, I misread that. It should be noted that we currently don't have fallback logic for the artist name. So it's an expected result that albums with no accessible metadata are shown under "Unknown artist". The fallback logic is able to deduce only the album name, track title, and track number. If you don't see these either, then something is wrong. But now I wonder what's up with those 8% of tracks which are not under "Unknown artist". Do those files have extractable metadata after all?

I guess the fallback logic could be extended to use the second level parent folder as artist name, as that Artist/Album/Title folder structure is quite common. On the other hand, this could be somewhat unexpected and can cause weird results for those users who do not use this kind of tree structure.
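For illustration only, here is a rough Python sketch of this kind of path-based fallback. It is not the Music app's implementation (the app is written in PHP); the function name, the `artist_from_folder` switch, and the track-number pattern are assumptions made up for this example.

```python
import re
from pathlib import PurePosixPath

# Assumed convention: an optional 1-3 digit track number at the start of
# the file name, followed by a separator, e.g. "03 - Song" or "03 Song".
_TRACK_PREFIX = re.compile(r"^\s*(\d{1,3})\s*[-._ ]\s*(.+)$")


def fallback_metadata(path, artist_from_folder=False):
    """Guess metadata from file and folder names only (hypothetical helper)."""
    p = PurePosixPath(path)
    parts = [part for part in p.parts if part != "/"]

    title, number = p.stem, None
    match = _TRACK_PREFIX.match(p.stem)
    if match:
        number = int(match.group(1))
        title = match.group(2).strip()

    # Parent folder is taken as the album name.
    album = parts[-2] if len(parts) >= 2 else None

    # Optional extension: second-level parent folder as the artist name.
    artist = None
    if artist_from_folder and len(parts) >= 3:
        artist = parts[-3]

    return {"artist": artist, "album": album, "title": title, "number": number}


if __name__ == "__main__":
    print(fallback_metadata("Artist/Album/03 - Song.flac", artist_from_folder=True))
```

With `artist_from_folder` left off, the artist stays empty, which corresponds to the "Unknown artist" grouping discussed above. Switching it on gives sensible results only for collections that really follow the Artist/Album/Title layout.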

HaseHarald commented 5 years ago

Hi there. Sorry for the late response, but after all this still is just a personal side project for me.

However, in the meantime I made some progress. Because of an issue with a different SMB share, where some files were not shown at all, I learned that the php-smbclient module itself is not capable of listening to updates from the Samba server. So I installed the regular smbclient as well. This solved the issue of updating file lists and, surprisingly, the readability of ID3 tags from the Music app as well. Now the numbers are quite the other way around. There are only a few albums under "Unknown Artist" and most of the others are recognized perfectly.

What makes me wonder, though: Some of the Albums listed under "Unknown Artist" have perfectly readable ID3-Tags. Even in the Music app. Clicking on the little "i" shows all fields "Artist", "Title", "Album" even the track number and release year of the album are filled. They are just not listed under the artist but under "Unknown Artist" for some reason.

As this only affects a few albums I'm fine with that though.

A suggestion on the fallback: Yes, you are right saying that it can lead to problems when someone doesn't follow the Artist/Album/Title structure. On the other hand it would lead to basically the same problems if one doesn't use the Artist/Title structure right now. If you want to circumvent those problems effectively, you have to make the structure configurable. Having Artist/Album/Title as a default would probably make a good starting point though.
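A configurable structure could look roughly like the Python sketch below. This is purely illustrative: the function `parse_by_layout` and the layout string format are invented for the example and do not correspond to any existing setting in the Music app.

```python
from pathlib import PurePosixPath


def parse_by_layout(path, layout="Artist/Album/Title"):
    """Map the trailing path components onto the field names in `layout`.

    Hypothetical helper: `layout` is a slash-separated list of field names
    describing how the last levels of the folder tree are organized.
    """
    fields = [name.strip().lower() for name in layout.split("/")]
    p = PurePosixPath(path)
    parts = [part for part in p.parts if part != "/"]
    if not parts:
        return {}

    # Use the file name without its extension for the last field.
    parts[-1] = p.stem

    # If the path is shallower than the layout, align from the right so
    # that the file name still maps to the last field.
    if len(parts) < len(fields):
        fields = fields[-len(parts):]

    return dict(zip(fields, parts[-len(fields):]))


if __name__ == "__main__":
    print(parse_by_layout("/music/Pink Floyd/Animals/Dogs.mp3"))
    print(parse_by_layout("/music/Animals/Dogs.mp3", layout="Album/Title"))
```

Aligning from the right means any leading folders (such as a mount point) are ignored, and a user with a flat Album/Title collection only has to change the layout string.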

paulijar commented 5 years ago

> Some of the Albums listed under "Unknown Artist" have perfectly readable ID3-Tags. Even in the Music app. Clicking on the little "i" shows all fields "Artist", "Title", "Album" even the track number and release year of the album are filled. They are just not listed under the artist but under "Unknown Artist" for some reason.

Hmm, maybe the metadata extraction for those files somehow failed during the scanning although it works now. The data shown in the Details view does not come from the database built during the scanning; it's read directly from the file when you click the "i" button. You could try to force a rescan of one of those problem files by renaming the file, or by deleting and then restoring it, and see what happens.

npodbielski commented 5 years ago

I also have music mounted via smb and it works fine for me.

magikmw commented 4 years ago

I have a similar issue; however, my files sit on S3-compatible storage. AudioPlayer doesn't have issues reading the tags, similar to what is described here: https://github.com/owncloud/music/issues/670#issuecomment-438069832

paulijar commented 3 years ago

> I guess the fallback logic could be extended to use the second level parent folder as artist name, as that Artist/Album/Title folder structure is quite common.

For the record, this improvement of the fallback logic was finally introduced earlier this month in Music v1.3.2.